What is Standard Deviation?
Standard deviation, or SD, is an estimate of the variation within a data set. In other words, it is one way to estimate how widely your values spread, or how tightly they cluster, around your average.
Why would I use it?
Let's say I want to know something about a population of fish in a lake, specifically the range of their lengths. There is no way I can catch and measure every fish, so I need some tools to make estimates. The first thing I have to assume about the population is that the lengths of the fish follow a "normal distribution", in other words, a "bell-shaped curve".
This curve tells us, first, that the population has an "average length" right at the peak of the curve, at the center of the distribution. Most fish are close to that average, with fewer individuals toward the extreme ends and an equal percentage above and below the average. Hence the distribution is normal, or bell shaped, around the average.
The next thing I need to assume is that the fish I catch and measure are a "random sample" from the population. In other words, I didn't catch just the larger fish or the smaller fish, but a random sample of small, medium, and large fish.
So, I catch 20 fish with an average length of 12 inches, and from that I calculate an SD of 3 inches. Now the calculation itself is not for the faint of heart. Luckily, a common program like Microsoft Excel can do it for you. Many electronic blood glucose (BG) logs will do it for you also.
So, what do those numbers tell me? They tell me that I would expect 68.27% of the fish to be between 9 and 15 inches long, 95.45% to be between 6 and 18 inches long, and 99.73% to be between 3 and 21 inches long. What does that mean? I don't know. Tell me something else about the fish and why the length is important to know.
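For anyone who wants to check those ranges, here is a minimal sketch in Python. It assumes the lengths really are normally distributed, and the helper name `normal_coverage` is made up for this illustration. The exact coverage values round to 68.27%, 95.45%, and 99.73%.

```python
import math

def normal_coverage(k: float) -> float:
    """Fraction of a normal distribution within k SDs of the mean."""
    return math.erf(k / math.sqrt(2))

mean_length = 12.0  # inches, average of the 20-fish sample
sd_length = 3.0     # inches, SD calculated from that sample

for k in (1, 2, 3):
    low = mean_length - k * sd_length
    high = mean_length + k * sd_length
    print(f"{normal_coverage(k):.2%} expected between {low:g} and {high:g} inches")
```

The percentages depend only on how many SDs you go out from the average, not on the fish. The inch ranges are just the average plus or minus 1, 2, and 3 SDs.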
An important point to make here is that if I could catch all the fish in the lake, I wouldn't need to estimate any of that. I would already know, from my measurements, what the biggest and smallest fish were. I only need to estimate the SD because I can't measure all the fish.
Standard Deviation and Blood Glucose
With BGs, there are two obvious problems with using SD to estimate variation.
First, your BGs over the course of the day, unlike fish in a lake, are not normally distributed around the mean. Second, the BG samples you take during the day, unlike fish you catch in a lake, are not random. Blood glucose, especially for diabetics, is highly skewed, and most of us take our BG readings at predetermined times every day that have everything to do with when we eat our meals. Hardly random samples.
Let's ignore those important assumptions though, and go ahead and calculate the SD for a typical day from a diabetic with reasonable numbers. You measure before each meal and twice after it, and once before you go to bed. Premeal, your BG is 80 and let's say you catch your peak BG at 120 one hour later. You do a second post-meal reading two hours after you eat and find that your BG drops back down to 80. Let's say you are lucky enough to have that happen for all three meals and to go to bed at 80.
That's 10 measurements (seven at 80 and three at 120) with an average BG of 92 and an SD of 19.32. So, you would expect 99.73% of your BG readings to fall between roughly 150 and 34?
Hmm. Did you actually have a spike close to 150 or a hypo down to 34? No, but according to the estimate of SD, that's what you would expect. In fact, 100% of your readings fall between a high of 120 and a low of 80 that day. The question is, is there any reason to believe that your range of BGs for that day was different than what you actually measured?
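If you want to run the numbers yourself, here is a minimal sketch using Python's standard `statistics` module. It takes the ten readings described above (seven at 80, three at 120) and uses the sample SD, the one with n - 1 in the denominator. By this calculation the average comes out at 92 and the SD at about 19.32.

```python
import statistics

# Three meals of (premeal 80, one-hour peak 120, two-hour 80), plus 80 at bedtime.
readings = [80, 120, 80] * 3 + [80]

mean_bg = statistics.mean(readings)
sd_bg = statistics.stdev(readings)  # sample SD, n - 1 denominator

low_3sd = mean_bg - 3 * sd_bg
high_3sd = mean_bg + 3 * sd_bg

print(f"average = {mean_bg:.2f}, SD = {sd_bg:.2f}")
print(f"average +/- 3 SD = {low_3sd:.0f} to {high_3sd:.0f}")
print(f"what you actually measured = {min(readings)} to {max(readings)}")
```

The last two lines of output are the whole argument: the 3 SD band is far wider than anything you actually logged.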
That's for you to answer by, maybe, testing more often at different times of the day, but would you do that because your SD tells you that you should expect a greater range, that is, higher highs and lower lows?
Of course not, but let's say you go ahead and test one more time at 3:00 am. Bam, 160! Wow, what a great tool! It practically predicted my high for the ENTIRE 24 hours without me having to measure it at all!!
Umm, not quite. Punch that 160 into your calculation for the day and your SD goes up to 27.5. Again, is there any reason to suspect that you had a higher high and a correspondingly lower low?
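Here is a quick sketch of what that one extra reading does, again with Python's `statistics` module and the sample SD. The variable names are just for illustration.

```python
import statistics

day = [80, 120, 80] * 3 + [80]   # the original ten readings
day_with_3am = day + [160]       # add the 3:00 am reading

sd_before = statistics.stdev(day)
sd_after = statistics.stdev(day_with_3am)

print(f"SD without the 3:00 am reading: {sd_before:.2f}")
print(f"SD with the 3:00 am reading:    {sd_after:.2f}")
```

One reading out of eleven moves the SD from about 19.3 to about 27.5, which shows how sensitive the number is to a single high.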
Umm, no.
So, you rinse and repeat, the stars align for you, and you are lucky enough to have the same exact 10 numbers (no 3:00 am surprise) for 10 straight days, or 100 BG readings. Your high and low do not change, nor does your average BG, but your SD goes down from 19.32 to 18.42. Did your control get tighter over those 10 days? No, you just took more samples. You would have had the same result if you had taken 100 samples in one day with the same numbers.
The point is, SD, by definition, is already highly dependent upon the range of BGs you measure and on the number of times you measure a day, but it will always be just an estimate with built-in assumptions for how the rest of your measurements should go.
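A short sketch of why the number drops, using the same ten readings repeated ten times. The sample SD divides by n - 1, so it shrinks a little as n grows even when the readings themselves are identical. The population SD (`statistics.pstdev`, which divides by n) is shown for comparison and does not move.

```python
import statistics

one_day = [80, 120, 80] * 3 + [80]  # ten readings
ten_days = one_day * 10             # the same ten readings, ten days running

sample_sd_1 = statistics.stdev(one_day)    # divides by n - 1 = 9
sample_sd_10 = statistics.stdev(ten_days)  # divides by n - 1 = 99
pop_sd_1 = statistics.pstdev(one_day)      # divides by n
pop_sd_10 = statistics.pstdev(ten_days)    # divides by n

print(f"sample SD, 1 day:   {sample_sd_1:.2f}")
print(f"sample SD, 10 days: {sample_sd_10:.2f}")
print(f"population SD, 1 day vs 10 days: {pop_sd_1:.2f} vs {pop_sd_10:.2f}")
```

Same highs, same lows, same average. The only thing that changed is the count of readings.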
If you are diligent about testing your BG, you will have already logged your highest high and your lowest low, how many highs you have a day, how many hypos, and more importantly, the conditions and situations that lead to highs and lows.
SD cannot tell you any of that. The only way to know for sure about the variation around your average BG is to measure it for yourself.
The real value of Standard Deviation
Let's go back to that population of fish in the lake. I have a profile of the length for the population from my measurements. I know that the state wants to stock the lake with another species of fish so I want to know if the population of native fish will be affected. I take measurements the next season and find that the average size of the fish drops to 10 inches but the SD goes up to 4 inches.
Now, I can do more analysis and determine something about my population of fish and how it's being affected by the introduced species.
The value of variation around the average BG, not SD, for a diabetic
Ultimately, we want to know something about our BG control and how our control is affecting our chances of being stricken with all those horrible complications. We know that HbA1c tells us something about our chances of getting complications because it tells us something about our average BG over an extended period. We want our average BG to be within the normal range for as long as possible, period.
We are also finding out that average BG is not enough. We need to know what our highs and lows are. More importantly, we need to know under what circumstances we are getting highs and lows, how long we are spending above normal, and how high we are actually going. We need to know what the variation is around the average, definitely.
As you can see, however, SD can give you no more information about how tight your control is than what you can obtain by being diligent about taking your BGs. In fact, SD can be highly misleading, and you cannot compare SD between individuals, because we all take different numbers of readings, at different times, and may have additional variables throughout the day that affect our individual ranges. Let's be real. If you're spiking to 250 because you miscalculated a carb number, or had a post-exercise high, it will affect your SD, obviously. But do those transient numbers really affect your chances of getting complications? Possibly, but your SD adds nothing to what you already know.
So, if I wave my 9.6 SD at you and your 19.32 SD, the first thing you should ask me is how many times I'm measuring a day. The second thing you should ask me is whether I actually recorded my highest high and lowest low.
More importantly, you should ask me why it even matters.