
Experimental Error Does Not Mean 'Mistake', Doug

It's interesting (and frustrating) watching folk like Doug use high-tech methods of distribution to ignorantly mock the scientific method. Without the scientific method, of course, the high-tech distribution channels he uses wouldn't even exist. Nor would modern transportation, medicine, agriculture, shelter and infrastructure, just to name a few. A student of history will immediately notice the correlation between the growth of the scientific method and the growth of modern civilization. Causation is also rationally inarguable.

But here's Dougie -- using his high-tech platform, created wholly by the scientific method, to misuse and berate the very platform upon which his ministry rests. All in a day's work for someone who talks a lot but doesn't know much.

In this installment I'm going to use an accusation made by Batchelor on his website against Willard Libby (the inventor of carbon dating) to discuss both the scientific principles involved in “experimental error”, and relate a brief version of an interesting story involving those principles.

Here is the website link to the accusation made by Batchelor.

And the relevant Batchelor quote from the above link:
“In science experiments, assumptions are critical. But if the starting assumption is false, the ensuing experiment will lead a scientist to draw a flawed conclusion, even if his calculations appear correct. Willard Libby, the developer of carbon dating, drew his conclusions based on the assumption that the earth was millions of years old. He calculated that it would take about 30,000 years for an atmosphere’s 14C/12C ratio to reach equilibrium. When he discovered that earth’s ratio was not in equilibrium, meaning it must be younger than 30,000 years, he dismissed it as an experimental error!”
There are so many things wrong with this paragraph, but like so many other YEC (Young Earth Creationist) canards, this one gets repeated and repeated, and the gullible and uninformed lap it up because it sounds so good. It's historically inaccurate and it contains flawed logic, but we'll deal with both of those in a later installment. All I'm going to address in this round is Batchelor's creative and ignorant use of the scientific terms “assumption” and “experimental error”, and show how this misuse gives an even greater false tone to what is already a historically false story.

Batchelor here uses the term to mean “Libby's results told him the earth was younger than 30,000 years but because of his stubborn assumption that the earth was old, he was dismissive of those results on the basis that the results of the experiment were a mistake.” This is simply not true and the process of the discovery and development of radiocarbon dating is well documented - including Libby's standard and scientifically proper use of the terms "assumption" and “experimental error”.

Before I get to Libby's own published words from the early era of radiocarbon research – words that will give the lie to Batchelor's assertion – let's first make sure we understand how the scientific community uses the terms “assumption” and “experimental error”.

First, let's deal with assumptions:

In science, assumptions are stipulations based on the best available knowledge. Each of us uses assumptions almost every moment of our everyday lives. We assume that the light will go on when we flick the bedside lamp switch and we base our nighttime behaviors on this assumption. We assume that we will be able to recognize that this assumption has failed if it does fail (no light) and we might keep a flashlight handy just in case. We assume the same of the flashlight. So on and so forth. Like those you use every day, assumptions in science have, with few exceptions, been tested and can be made with confidence - confidence that they will work and that when they don't work, we can recognize that they have failed.

When in science might one be allowed to use untested assumptions? In the context of a hypothesis. When one creates a scientific hypothesis, one might need at the start to make assumptions that are untested - and then test them as a matter of course.

Scientific hypotheses are the initial building blocks of science. Someone comes up with an idea – often not much more than an educated guess based on prior knowledge and observation. The key to such a hypothesis is that it must have no predetermined outcome and it must be something that can be refuted or supported through observation and experimentation. It must be both testable and falsifiable - it must be possible to prove it wrong.

At the core of a hypothesis are new ideas and assumptions – and in the presentation of a new hypothesis, one or more untested assumptions might be made. When used in conjunction with a hypothesis, there is no negative connotation attached to an untested assumption – it is not a product of stubborn nature or laziness, but rather a product of publicly acknowledged ignorance.

Untested assumptions are unavoidable at the start, but the only way a hypothesis moves forward towards any possibility of confirmation and practical usage is through the replacement of those untested assumptions one by one with solid demonstrated observations and data. Batchelor attempts to attach a pejorative stench to the word “assumption” that simply isn't there in the context of a hypothesis.

Here is a great link regarding scientific assumptions:

And now let's talk about the term "experimental error" and how it's used in science.

Scientists in their professional work also don't use the word “error” as one does in common language, where it might refer to a typo or an unforced error in tennis. In science those are just mistakes. In the scientific sense, the term “error” generally refers to the difference between a measured value and the actual value. These errors can be myriad and diverse in cause. There are standard formulas for expressing these experimental errors.

In science there is no perfection – science recognizes that it doesn't exist. Every single measurement carries some amount of error. Experiments generally involve a number of different measurements, and these errors accumulate into what is often called an “error stack” (as in 'error on top of error'). Some types of measurement can have an error stack all their own that compounds upon itself and then gets added to the total error stack of the experiment (we call this a 'compounding error' and it's nasty, as we'll see). If this sounds like a whole slew of errors – it certainly can be and often is. The reduction of these error stacks is critical to science, and it is why so many repeats, double checks, independent tests, sources and scientists are involved in a good science project before it is considered confirmed.

Let's use an automobile example to illustrate an error stack and to highlight the difference between simple errors and compound errors. Additionally we will show just how destructive error propagation can be to accuracy.

Every gear, every bearing, every connection in the driveline of your car has some small amount of clearance built into it. If not for these small clearances, your car would bind up and not move. In gears for instance, this clearance is called “gear lash”. When considering your car's driveline from one end to the other, there are quite a number of gears and each bit of gear lash is added to the next. When added together along with bearing clearances, etc., we get in total what is called “driveline lash”.

You can quantify the amount of driveline lash in your car by parking it with the brakes off and the transmission engaged (use wheel chocks please). Jack one drive wheel off the ground and attempt to turn the wheel forward and reverse. You will be able to move it some small amount back and forth easily – that is the total driveline lash. Newer cars will likely have less, older cars more. For the sake of this discussion we're going to say we have 1 inch of driveline lash – that is if the car were back on the ground, we could push the entire car back and forth one inch utilizing this lash.

Now, let's think about the odometer on this car. Let's drive the car for what the odometer says is one mile and park the car. Assuming all other components of the odometer are perfectly accurate, are we exactly one mile away from where we started? Well, because of driveline lash we don't know – remember that we can move the car one inch forward and back when it's parked without the odometer moving. The combined lash of your drivetrain is an 'error stack' impeding the accuracy of your odometer. Every little error from each gear and bearing has added up to an inch of variation, and a scientist would say that you have traveled 1 mile (+/- 1 inch). That's one mile, plus or minus one inch of error.

Notice the usage of the term error and how it does not mean a mistake of any kind was made. We know we need clearance in the drive train – it's there intentionally, yet in science it's still called an error.

The nice thing about the above drivetrain lash 'error' is that it doesn't compound on itself – it's an error that doesn't propagate. We can drive a hundred miles, park the car and we can say that we have traveled 100 miles (+/- 1 inch).

Now let's change this error up and see what happens. Let's take our car and put tires on it that have a circumference 1 inch bigger than the odometer is calibrated for. Let's additionally suppose that this 1 inch amounts to 1% of the total circumference of the tire. Now every time the odometer claims we have driven 100 inches over the ground, we have actually driven 101 inches. Just like the driveline lash it's only a 1 inch error, but it's the sort of error that happens over and over – adding to itself every time the tire rolls one revolution. If we park our car after the odometer reads one mile, we are actually almost 53 feet beyond a mile. If we drive a hundred miles as read by the odometer we are a full mile past that distance, and after a thousand miles we have accrued 10 miles of error. (Our drivetrain lash error? It's still only +/- 1 inch total.)

That second sort of error is a 'compound error' or a propagating error. The bad news is that they add up fast. The good news is that they add up fast (so they are often easy to spot).
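For those who like to check the arithmetic, here is a minimal Python sketch of the two odometer errors from the car example. The function names and constants are my own, made up purely for illustration, and the sketch assumes exactly the numbers used above: 1 inch of driveline lash and a tire that is 1% oversize.

```python
FEET_PER_MILE = 5_280

def lash_error_inches(odometer_miles, lash_inches=1.0):
    """Non-propagating error: the bound is the same no matter how far we drive."""
    return lash_inches

def tire_error_miles(odometer_miles, oversize_fraction=0.01):
    """Propagating error: grows in proportion to the distance driven."""
    return odometer_miles * oversize_fraction

for miles in (1, 100, 1000):
    lash = lash_error_inches(miles)
    tire = tire_error_miles(miles)
    print(f"{miles:>5} mi on the odometer: lash +/- {lash} in, "
          f"tire error {tire:.2f} mi ({tire * FEET_PER_MILE:.1f} ft)")
```

The lash figure never changes from row to row, while the tire figure scales with the miles driven. That is the whole difference between an error that just sits in the stack and one that propagates.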

Some errors are systematic in nature – for example, a simple experiment calculating the value of gravity by swinging a pendulum will invariably be affected by timing uncertainties, in addition to variations in air density, temperature variations impacting mechanical friction, etc. Other errors may be random in nature, and these can often be mitigated in final results by taking multiple measurements.
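To see why repeating a measurement helps with random errors but not with systematic ones, here is a small simulated sketch. Everything in it is an assumption made up for illustration: the "true" value, the size of the fixed offset, the amount of random scatter, and the function name. It is not data from any real pendulum experiment.

```python
import random
import statistics

def simulate_readings(true_value, bias, noise_sd, n, seed=0):
    """Hypothetical readings: true value + a fixed systematic offset + random scatter."""
    rng = random.Random(seed)
    return [true_value + bias + rng.gauss(0.0, noise_sd) for _ in range(n)]

TRUE_VALUE = 9.81   # assumed "actual" value, for illustration only
BIAS = 0.20         # assumed systematic offset (same direction every time)
NOISE_SD = 0.50     # assumed random scatter of a single reading

readings = simulate_readings(TRUE_VALUE, BIAS, NOISE_SD, n=10_000)
average = statistics.mean(readings)

print(f"single reading : {readings[0]:.3f}")
print(f"average        : {average:.3f}")
print(f"true value     : {TRUE_VALUE:.3f}")
print(f"leftover offset: {average - TRUE_VALUE:.3f}")
```

Averaging thousands of readings washes out most of the random scatter, but the average still sits off the true value by roughly the systematic offset. More measurements do not fix that kind of error; finding and removing its cause does.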

Also important to interpreting the impact of experimental errors is understanding the difference between “accuracy” and “precision”. One can be extremely precise but have no accuracy whatsoever. Conversely, one can be consistently accurate without great precision. The following image uses darts to illustrate the differences between accuracy and precision.

As an example of how experimental errors can impact accuracy and precision, let's look at example (b) in this dart illustration. The darts have been thrown quite precisely – they are very close together; however, they are far right of the bullseye. A good scientist would immediately suspect a systematic error rather than random errors as the cause. The errors are repeating and in the same direction each time. Perhaps there is a strong steady wind blowing from left to right that is not being accounted for. Perhaps the person tossing the darts has a vision problem or an impediment in their throwing motion.

Let's look at another example from the image – example (a). This wide grouping tends towards the random, but 3 darts is not a lot of data to make that call. A good scientist is going to say - “let's see some more darts hit the board and then we'll decide.”
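The same distinction can be put into numbers. Here is a rough Python sketch that takes dart positions and reports how far the center of the group sits from the bullseye (accuracy) and how tightly the darts cluster (precision). The coordinates are hypothetical, invented to resemble a tight-but-off-target group and a scattered group; they are not read off the illustration.

```python
import math
import statistics

def summarize(darts):
    """Return (offset, spread) for a list of (x, y) hits; the bullseye is at (0, 0)."""
    center_x = statistics.mean([x for x, _ in darts])
    center_y = statistics.mean([y for _, y in darts])
    # Accuracy: how far the center of the group is from the bullseye.
    offset = math.hypot(center_x, center_y)
    # Precision: average distance of each dart from the center of its own group.
    spread = statistics.mean(
        [math.hypot(x - center_x, y - center_y) for x, y in darts]
    )
    return offset, spread

# Hypothetical dart positions, in inches from the bullseye.
tight_but_off = [(5.0, 0.1), (5.2, -0.1), (4.8, 0.0)]
scattered = [(-4.0, 3.0), (3.0, 4.0), (1.0, -7.0)]

for name, darts in (("tight but off", tight_but_off), ("scattered", scattered)):
    offset, spread = summarize(darts)
    print(f"{name:>13}: offset {offset:.2f} in, spread {spread:.2f} in")
```

A large offset with a small spread points toward a systematic error. A small offset with a large spread looks more like random error, though with only three darts in each group neither call should be made with much confidence.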

One important thing to take from this is that after only one dart, there is simply no way to learn anything from that one dart regarding what type of error might be influencing the result. Remember this particular point because it's key to Libby's statement regarding experimental error that is twisted by Batchelor. But before I get to that, I need to give a bit of a quick history timeline of the development of radiocarbon dating.

Though he conceived of the idea of radiocarbon dating in the early 1940s, Libby didn't publish his hypothesis until 1946 in Science Magazine. Soon after, his practical and experimental efforts began and continued until 1960, when he was awarded the Nobel Prize for the discovery. The effort to improve the science of radiocarbon dating has not abated since. Libby spent nearly 15 years first improving the testing equipment to make the measurement process practical, then vigorously testing and refining the assumptions of the hypothesis. Notice those last 8 words – “testing and refining the assumptions of the hypothesis”.

Contrary to Batchelor's tone, soon after the 1946 publication of his first paper, Libby and his team immediately began to tirelessly attack and test his hypothesis. The two key assumptions in the hypothesis were the half-life of the isotope itself and the historical ratio of C14 in the atmosphere.

Very little testing had been done regarding the half-life of this isotope, and at the time it was thought by some to be as low as 3,000 years and by others as much as 25,000 or so. Within a couple of years the half-life figure of 5568 (+/- 30) years had been established. Technological improvements soon updated this number to 5730 (+/- 40) years, a change of about 3%. That later number has held to this day, though there is undoubtedly the opportunity for an even finer point to be put on the number as technologies continue to improve.
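To show how the half-life feeds into a date, here is a minimal sketch using the standard exponential decay relationship (half of the C14 remains after each half-life). The constant and function names are mine, chosen for illustration, and the sketch deliberately ignores everything else in the error stack, including the atmospheric ratio question discussed next.

```python
import math

EARLY_HALF_LIFE = 5568.0    # years, the earlier figure quoted above
REVISED_HALF_LIFE = 5730.0  # years, the later figure quoted above

def fraction_remaining(age_years, half_life):
    """Fraction of the original C14 left after age_years."""
    return 0.5 ** (age_years / half_life)

def age_from_fraction(fraction, half_life):
    """Invert the decay law: age implied by the fraction of C14 remaining."""
    return half_life * math.log2(1.0 / fraction)

change = (REVISED_HALF_LIFE - EARLY_HALF_LIFE) / EARLY_HALF_LIFE
print(f"half-life revision: {change * 100:.1f}%")

# A sample with a quarter of its C14 left is two half-lives old under either figure.
for half_life in (EARLY_HALF_LIFE, REVISED_HALF_LIFE):
    print(f"half-life {half_life:.0f} y -> age {age_from_fraction(0.25, half_life):.0f} y")
```

Because the computed age is directly proportional to the half-life, a revision of about 3% in the half-life shifts every computed age by that same 3% or so, no more.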

From the conception of the method it was known that valid readings depended on the ratios of carbon in the atmosphere, and as soon as the team had developed sufficiently accurate ways to measure carbon in the aged samples and had determined a half-life, they immediately began testing the initial hypothesis assumption that the ratio was stable. Let's remember that it's now only late 1949 – only 3 years into the nearly 15-year initial development process – and the team is already hard at work attempting to confirm or deny this important assumption of the hypothesis. Batchelor would have you believe that Libby was obstinate, stubborn and unmoving on the point of atmospheric equilibrium, whereas in truth, to Libby atmospheric ratios were just another assumption on the list that must be tested and tested until assumptions gave way to known values.

How can you possibly devise a test to determine atmospheric ratios in the past? There are a number of ways (especially in 2015), but in the late '40s it was still quite simple – date objects of known age. If the result returned is wildly off from reality, then you know the ratios have varied widely in the past. If the results are close, you know the variations have been small. If the results are really close (and this is very, very important to the principles of experimental error) – within the bounds of the error stack of the entire experiment – you might not even be able to tell if the variation is caused by a lack of equilibrium in the atmosphere or by some other error in the stack. More importantly, you may not need to know which error is dominating, as your entire result is close enough for what you are attempting to achieve with the experiment.

This is where Libby found himself in 1949. With the cooperation of the archaeologists and anthropologists - who until this time held all the scientific dating knowledge in the world, he gathered together samples from key objects in recorded history and tested them. The result just to the left here was a small chart with a big impact – it is historically remembered as “The curve of the knowns.”

What you see in the chart is a slightly curved line predicted by the method. The points on the chart represent the actual ages of the various objects accompanied by the error ranges of those dates. The closer the points are to the line, the more accurate the dating method was in this test. Most of the group were within 3-5% of prediction with one outlier of ~8%. Each one fell within the +/- 10% that was the calculated experimental error for the test.
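The check being described is simple enough to sketch in a few lines of Python. The function names and the sample ages below are hypothetical, invented only to mirror the percentages mentioned above (an 8% miss against a 10% calculated error); they are not Libby's actual measurements.

```python
def percent_deviation(measured_age, known_age):
    """How far the measured age is from the known age, as a percentage."""
    return abs(measured_age - known_age) / known_age * 100.0

def within_experimental_error(measured_age, known_age, error_percent=10.0):
    """True if the miss is no bigger than the calculated experimental error."""
    return percent_deviation(measured_age, known_age) <= error_percent

# Hypothetical object: known to be 4,000 years old, dated at 4,320 years.
known, measured = 4000.0, 4320.0
print(f"deviation: {percent_deviation(measured, known):.1f}%")
print(f"within +/- 10%: {within_experimental_error(measured, known)}")
```

A point that lands off the line but inside the error bound is not a failed test. It is exactly what a measurement with a 10% error stack is expected to look like.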

Years later, in his Nobel presentation, here is what Libby said about the 1949 test, which greatly confirmed the two primary assumptions of the hypothesis (the half-life assumption and the equilibrium assumption):
“[If the assumptions were correct], we should expect to find that a body 5,600 years old would be one-half as radioactive as a present-day living organism. This appears to be true. Measurements of old artifacts of historically known age have shown this to be so within the experimental errors of measurement.”
And we just got to the meat of the issue regarding Batchelor's accusation. See the use of the term “experimental error” by Libby? Batchelor wants to say “no you didn't confirm equilibrium you stubborn scientist, just look at those dots that are not on that line”, while Libby rightly sees his confirmation within the calculated and expected bounds of his error stack. Libby didn't expect perfection; he just hoped the results would be within 10% of the known ages, and when they were, for the moment he was a happy man.

So even now after these tests did Libby and the scientists that followed blindly assume that the atmosphere was in equilibrium? Nope - remember the “you can't learn much from one dart” note that I earlier asked you to keep handy? That principle is what drove Libby and the team to throw more darts ... lots of darts.

Continuing on with the testing program, they pressed right on, working to test this assumption more thoroughly. When the darts began to fall in patterns, it became clear that there were definite but small variations in historical atmospheric C14/C12 ratios. Following extensive testing of more objects of known age, and in work with Dr. Hans Suess, a calibration curve was eventually published (at left) showing the changes in C14 concentrations in the atmosphere over the last couple thousand years or so.

Scientists have continued to extend and confirm this calibration curve through dendrochronology, sediment varves, coral reefs and other studies. The curves now overlap with ice-core-calibrated curves for other radiometric dating methods, which helps extend the anchored calibration back more than 50,000 years using actual reference measurements.

To the left is the latest (IntCal13) calibration curve used internationally for carbon dating – it goes back 50,000 years, which for all practical purposes is the limit of carbon dating. Remember that this calibration curve isn't wild speculation; it was created by measuring countless objects whose ages have been determined through a myriad of other independent methods.

Notice a couple of interesting points about this chart:

1: the variations in atmospheric C14 have been relatively small over the last 50,000 years and remarkably close to Libby's original hypothesis (the diagonal line).

2: over the last 50,000 years, objects of known age have proven to be rather consistently older than Libby's method predicts – shown by the calibration line being below the diagonal line.

In other words, the fact that the earth's atmosphere varies in C14 concentrations has almost invariably worked in a direction favorable to Batchelor's YEC assertions. Libby's method wasn't dating things and claiming things were older than they were – turns out the method was dating things and claiming they were younger than they actually were.

To summarize this installment: In science, assumptions do not mean obstinacy, nor do experimental errors mean mistakes. Libby merely started with standard hypothetical assumptions and worked for years to increase the accuracy of the method by testing and correcting those initial assumptions. One only has to look at the tens of thousands of individual measurements of objects of known age to see that Libby was pretty darn close to perfect even on his initial assumptions.

We now far better understand the causes of these variations (variations in the magnetic field of the earth are one major cause), but even if we weren't aware of the variations or their causes, the dating method would still be reasonably close considering the ages we are dealing with. What we have now, however, really makes all of that equilibrium talk moot -- we have a calibration curve anchored to countless thousands of measurements of known age going back tens of thousands of years.

To summarize that summary, Batchelor needs to learn more about science before he talks about science.

------------------------------------------------

For those interested in reading the documentation for the Libby story told above, the links follow.

Link 1
Link 2
Link 3
Link 4
