# Why Standard Deviation of Blood Glucose can be a meaningless number for diabetics

What is Standard Deviation?

Standard Deviation, or SD, is an estimate of the variation within a data set. In other words, it's one way to estimate the spread, or clustering, of your data around the average.

Why would I use it?

Let's say I want to know something about a population of fish in a lake. Let's say I want to know the range of lengths of those fish. There is no way that I can catch all the fish and measure all of them, so I need some tools to make estimates. The first thing that I have to assume about the population is that the lengths of the fish follow a "normal distribution", in other words, a "bell shaped curve".

[Image: Image375.gif, the bell shaped curve of a normal distribution]

This curve tells us, first, that the population has an "average length", right at the peak of the curve and the center of the distribution. Fewer individuals sit towards the extreme ends, and an equal percentage is distributed above and below that average. Hence the distribution is normal, or bell shaped, around the average.

The next thing I need to assume is that the fish I catch and measure are a "random sample" from the population. In other words, I didn't catch just the larger fish or the smaller fish, but a random sample of small, medium, and large fish.
So, I catch 20 fish with an average length of 12 inches, and from that I calculate an SD of 3 inches. Now the calculation itself is not for the faint of heart. Luckily, a common program like Microsoft Excel can do it for you. Many electronic BG logs will do it for you also.
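For the curious, the calculation can be sketched in a few lines of Python. This is a minimal sketch of the usual "sample" form with an n - 1 denominator, which is assumed here to be what the spreadsheet or BG log computes. The fish lengths are invented for illustration and are not the 20 fish from the example.

```python
import math

def sample_sd(values):
    """Sample standard deviation using the n - 1 denominator."""
    n = len(values)
    mean = sum(values) / n
    squared_deviations = [(v - mean) ** 2 for v in values]
    return math.sqrt(sum(squared_deviations) / (n - 1))

# Hypothetical lengths in inches, invented for illustration only.
fish_lengths = [8, 10, 11, 12, 12, 13, 14, 16]
mean_length = sum(fish_lengths) / len(fish_lengths)

print(f"mean = {mean_length:.2f} in, SD = {sample_sd(fish_lengths):.2f} in")
```

Each step is simple (average, squared distances from the average, divide, square root). It is just tedious to do by hand for more than a few numbers.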

So, what do those numbers tell me? They tell me that I would expect 68.26% of my fish to be between 9 and 15 inches long, 95.44% of my fish to be between 6 and 18 inches long, and 99.74% of my fish to be between 3 and 21 inches long. What does that mean? I don't know. Tell me something else about the fish and why the length is important to know.
An important point to make here is that if I could catch all the fish in the lake, I wouldn't need to estimate any of that. I would already know, from my measurements what the biggest and smallest fish were. I only need to estimate the SD because I can't measure all the fish.
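The percentages and ranges quoted for the fish come straight from the normal curve, and can be reproduced with a short sketch. It assumes the fish example's average of 12 inches and SD of 3 inches, and uses the error function from Python's standard library. The computed percentages (about 68.27%, 95.45% and 99.73%) differ in the last digit from the rounded table values quoted above.

```python
import math

def normal_coverage(k):
    """Fraction of a normal distribution lying within k SDs of the mean."""
    return math.erf(k / math.sqrt(2))

mean_length, sd_length = 12, 3  # inches, from the fish example

intervals = {}
for k in (1, 2, 3):
    low = mean_length - k * sd_length
    high = mean_length + k * sd_length
    intervals[k] = (low, high)
    print(f"within {k} SD: {low} to {high} in, "
          f"about {normal_coverage(k):.2%} of fish")
```

Note that the percentages depend only on the normal-distribution assumption, not on the fish. If the lengths are not bell shaped, the ranges still compute, but the percentages attached to them no longer hold.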
Standard Deviation and Blood Glucose

With BGs, there are two obvious problems with using SD to estimate variation.

First, your BGs over the course of the day, unlike fish in a lake, are not normally distributed around the mean. Second, the BG samples you take during the day, unlike fish you catch in a lake, are not random. Blood Glucose, especially for diabetics, is highly skewed, and most of us take our BG readings at predetermined times every day that have everything to do with when we eat our meals. Hardly random samples.

Let's ignore those important assumptions though, and go ahead and calculate the SD for a typical day from a diabetic with reasonable numbers. You measure before and after each meal, and maybe once before you go to bed. Premeal, your BG is 80 and let's say you catch your peak BG at 120 one hour later. You do a second post-meal reading two hours after you eat and find that your BG drops back down to 80. Let's say you are lucky enough to have that happen for all three meals and to go to bed at 80.

That's 10 measurements with an average BG of 100 and an SD of 19.32. So, you would expect 99.74% of your BG readings to fall between 157.96 and 42.04?

Hmm. Did you actually have a spike close to 160 or a hypo down to 42? No, but according to the estimate of SD, that's what you would expect. In fact, 100% of your readings fall between a high of 120 and a low of 80 that day. The question is, is there any reason to believe that your range of BGs for that day was different than what you actually measured?
That's for you to answer by, maybe, testing more often at different times of the day, but would you do that because your SD tells you that you should expect a greater range, that is, higher highs and lower lows?

Of course not, but let's say you go ahead and test one more time at 3:00 am. Bam, 160! Wow, what a great tool! It actually predicts what my high was for the ENTIRE 24 hours without me having to measure it at all!!

Umm, not quite. Punch that 160 into your calculation for the day and your SD goes up to 27.5. Again, is there any reason to suspect that you had a higher high and a correspondingly lower low?
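To see how much a single reading moves the number, here is a small sketch that adds the 3:00 am reading of 160 to the same ten readings and recomputes the sample SD.

```python
import statistics

day = [80, 120, 80] * 3 + [80]  # the ten readings from the example day
with_3am = day + [160]          # plus the extra 3:00 am reading

sd_before = statistics.stdev(day)
sd_after = statistics.stdev(with_3am)

print(f"SD before: {sd_before:.2f}")
print(f"SD after adding 160: {sd_after:.2f}")
```

One extra measurement out of eleven is enough to move the SD from about 19.3 to about 27.5.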

Umm, no.

So, you rinse, repeat, the stars align for you and you are lucky enough to have the same exact numbers for 10 straight days, or 100 BG readings. Your high and low do not change, nor does your average BG, but your SD goes down to 18.42. Did your control get tighter over those 10 days? No, you just took more samples. You would have had the same result if you had taken 100 samples in one day with the same numbers.
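The drop from 19.32 to 18.42 can be reproduced by simply repeating the same day ten times. The sketch below also prints the "population" form of SD (denominator n instead of n - 1), which is not mentioned in the post and is included only for comparison. It stays the same in both cases, which suggests the change comes entirely from the sample-size correction and not from anything about the readings themselves.

```python
import statistics

one_day = [80, 120, 80] * 3 + [80]
ten_days = one_day * 10  # identical readings for ten straight days

for label, data in (("1 day", one_day), ("10 days", ten_days)):
    print(f"{label}: n = {len(data)}, "
          f"sample SD = {statistics.stdev(data):.2f}, "
          f"population SD = {statistics.pstdev(data):.2f}")
```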

The point is, SD, by definition, is already highly dependent upon the range of BGs you measure and on the number of times you measure a day, but it will always be just an estimate with built-in assumptions for how the rest of your measurements should go.

If you are diligent about testing your BG, you will have already logged your highest high and your lowest low, how many highs you have a day, how many hypos, and more importantly, the conditions and situations that lead to highs and lows.

SD cannot tell you any of that. The only way to know for sure about the variation around your average BG is to measure it for yourself.

The real value of Standard Deviation

Let's go back to that population of fish in the lake. I have a profile of the length for the population from my measurements. I know that the state wants to stock the lake with another species of fish so I want to know if the population of native fish will be affected. I take measurements the next season and find that the average size of the fish drops to 10 inches but the SD goes up to 4 inches.

Now, I can do more analysis and determine something about my population of fish and how it's being affected by the introduced species.

The value of variation around the average BG, not SD, for a diabetic

Ultimately, we want to know something about our BG control and how our control is affecting our chances of being stricken with all those horrible complications. We know that Hb A1c tells us something about our chances of getting complications because it tells us something about our average BG over an extended period. We want our average BG to be within the normal range for as long as possible, period.

We are also finding out that average BG is not enough. We need to know what our highs and lows are. More importantly, we need to know under what circumstances we are getting highs and lows, how long we are spending above normal, and how high we are actually going. We need to know what the variation is around the average, definitely.

As you can see, however, SD can give you no more information about how tight your control is than what you can obtain by being diligent about taking your BGs. In fact, SD can be highly misleading, which means that you cannot compare SD between individuals, because we all take different numbers of readings, at different times, and may have additional variables throughout the day that affect our individual ranges. Let's be real. If you're spiking to 250 because you miscalculated a carb number, or had a post exercise high, it will affect your SD, obviously. But do those transient numbers really affect your chances of getting complications? Possibly, but your SD adds nothing to what you already know.

So, if I wave my 9.6 SD at you and your 19.32 SD, the first thing you should ask me is how many times I'm measuring a day. The second thing you should ask me is did I actually record my highest high and lowest low?

More importantly, you should ask me why it even matters.

Bah, correction..

The average BG is 92, which gives you a high of roughly 150 and a low of roughly 35. Punching 150 into the BG readings, of course, raises your average BG to about 97.3. SD goes to 25.

Feel free to check my calcs, but hopefully, you get the point of SD's dependency on assumptions.

I agree with you, and think of the importance of the time variable (usually not considered): 8 measurements during the day weigh more than 3 measurements during 12 hours at night, according to this method.
You measure when you feel bad, not when you feel good, and so on.
But we need a "short" number to benchmark our (personal) performance, detect a trend and improve it.
We have too much data to manage, and sometimes we think we have improved control compared with the past when actually we have not.

Would your opinion change if the SD was provided by a CGM?

Not really Jim because it doesn't change the underlying assumptions that need to be made, the most important of which is that your BG readings follow a normal distribution.

Assuming that your CGM (I have a Dexcom btw, which I absolutely love) is accurate, what it does do for you is give you many, many more samples of BG to compare and a much better opportunity to determine what your actual high for the day is and what your actual low for the day is. More importantly, what it really does, as garidan alludes to, is show you how long you're spending at any given BG. I'm not going to argue that a spike to 250 is not damaging but I will argue that 2 hours spent at 250 is more damaging than 30 minutes spent at 250. SD doesn't give you that important distinction.
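A small sketch can make the time-at-250 point concrete. The two traces below are invented, and the 5-minute spacing between readings is an assumption about the CGM. They are built so that one spends 30 minutes at 250 and the other 2 hours, yet both give exactly the same SD.

```python
import statistics

SAMPLE_MINUTES = 5  # assumed spacing between CGM readings

def minutes_at_or_above(readings, threshold, sample_minutes=SAMPLE_MINUTES):
    """Approximate time at or above threshold, one interval per reading."""
    return sum(1 for r in readings if r >= threshold) * sample_minutes

# Two invented 2.5-hour traces (30 readings each), in mg/dl.
brief_high = [100] * 24 + [250] * 6   # 30 minutes at 250
long_high = [100] * 6 + [250] * 24    # 2 hours at 250

for label, trace in (("brief", brief_high), ("long", long_high)):
    print(f"{label}: SD = {statistics.stdev(trace):.1f}, "
          f"minutes at 250 = {minutes_at_or_above(trace, 250)}")
```

The averages differ, but the SD on its own cannot tell these two traces apart, while a simple count of time above a threshold can.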

I do think that any comparisons between data from accurate CGMs, including SD, are going to be more useful than comparisons between numbers generated from BG meter readings, because you can be a little bit more sure that the number and timing of the samples are comparable.

I still don't see what extra information you are getting from the SD that the raw numbers themselves aren't going to tell you.

I would think that SD would be more informative with more readings? I look at it as a measure of how wide the swings are? If I look at my older reports, it was in the higher 30s and now it's in the higher 20s, not a huge amount of progress but, in general, I *know* my BG is more stable these days and that's substantiated by the SD. The A1C has been 5.4/ 5.6 (I had a 5.0 "outlier" last spring...) instead of 5.8 when the SD was higher. I guess it's not a big surprise but people are often suggesting that A1C shouldn't matter for a variety of reasons and I don't agree with that either. Given the vast number of numbers that we deal with, I think any additional numbers that help refine the picture can be useful. Or at least make you think "there's something else to look at". I am perhaps not a good example b/c I am always trying to "win" every test or at least have a plan in place to anticipate problems and keep the lid on them.

Also, for "math challenged" people like me (not really but I don't bother with it...) I am *only* going to get any numbers out of my computer/ meter/ pump. I haven't kept a "log" regularly since I was in high school (1984...) except for a month to get a pump. I also may squeeze in a few extra tests, more like 14/ day, which might provide more data to make the sample size larger, on top of what the CGM spits out? I'm not 100% sure if the SD number in the CGM comes from the CGM readings or meter readings. I suppose that there could be a "skew" because of the tactical utility of readings when you are high or low and that you might end up with more readings around your out of range type of numbers?

I was gonna say too, thanks for the informative post!

Ok that makes sense. I guess with CGM I like to use average in combination with SD since average can hide the fact that I may be spiking around more than I want. The example I would use is I may have an A1c of 6.0, but that may hide the fact that I have a lot of 250 and 40 BGs. In this case my SD would be high, but my average low. I just think both numbers are good when used in conjunction with each other.
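The point about an average hiding swings can be illustrated with two invented sets of readings that share the same average. The sketch prints the SD alongside the lowest and highest reading, so both ways of spotting the swings can be compared.

```python
import statistics

# Two invented sets of readings (mg/dl) with the same average.
steady = [110, 120, 130, 115, 125]
swinging = [40, 250, 60, 200, 50]

for label, data in (("steady", steady), ("swinging", swinging)):
    print(f"{label}: mean = {statistics.mean(data)}, "
          f"SD = {statistics.stdev(data):.1f}, "
          f"low = {min(data)}, high = {max(data)}")
```

Both the SD and the raw low and high flag the second set as the less stable one.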

Thanks for the informative blog. I enjoyed.

Thanks guys,

For acid, first, I'd say that a wrench is going to be a bad tool for pounding a nail into wood no matter how many times and how hard you whack the nail. Not saying it doesn't "work", but, hey.

For acid and Jim, what we are looking for is the variation in our BGs around our average, or how far, and hopefully, how many times we fall outside of a healthy, non-complication inducing, range of BG. That's very important to know, accurately and precisely.

If you look at the graph, you can see that "one standard deviation" or "two standard deviations" have very specific meanings for the number of readings you should find a certain distance from the mean. It also has a very specific meaning in terms of how far away from the average, in both directions, your readings should go. The problem is, as you can see, what the SD says your BG profile should look like may not be accurate at all.

Just because the vast majority of people who refer to SD for BG's don't use it that way, doesn't mean that it's not supposed to be used that way.

I'm not surprised that there is a correlation between SD and A1c because, after all, you're using your numbers to generate the SD and if your average is smaller with tighter numbers, your SD, as a number, will respond accordingly. The problem is, first, what does the "number" mean? Second, you could be having better control, overall, or worse, and not seeing the same response from your SD because of a few highs, or lows that affect your SD without really affecting your control.

In the end, if you are like Jim and you know you are having 250 spikes and 40 crashes to go along with your A1c of 6.0, I don't think you really need an extra number like SD to tell you that you need to work on those highs and lows.

Hey, in the end if you are paying attention to your numbers, making adjustments, and finding your SD useful, I think that's awesome. I actually pay close attention to my SD along with my highs, lows, A1C, hopefully time spent above and below normal range from my Dex when it's generating good info...

I'm a firm believer that if you need to pound a nail and all you have is a wrench, go for it. I'd be more cautious about worrying about what any particular SD number means if someone tries to tell me I need to decrease my SD without actually looking at the rest of my numbers.

I'm wondering two things:

You say in your post that BGs do not fall into a normal curve. Which is true. Yet you are trying to make the SD "fit" a normal curve. Can you not have a SD fit a skewed curve (towards high or low) and still be accurate? (I've taken a grand total of one statistics course, so I really am wondering!)

I would agree with you that for a day of readings SD is pretty useless. However, I think the point of SD is that it's supposed to be a "long-term" measure similar to A1c, not necessarily a number you look at daily (I would also argue that looking at an average for just one day isn't really useful). So while you may not get a 150 as a high in a day or two of readings, in several months of readings you probably would. Does this make it more useful?

I also think you are thinking of SD in the sense of inferential statistics, when it's really meant to be used as a descriptive statistic. It's meant to keep track of variability rather than predict variability. I'm also not sure it's really that connected to statistics at all. I've read in several places that SD should be less than 30 mg/dl and/or less than 1/3rd of the BG average. In that sense, distributions don't even really come into play, it's just a convenient way of summarizing things (maybe it also needs a new name?).

Hi Jen, thanks for the comments.

Couple things.

First, can SD be made to fit a skewed curve?

No. By definition, it's parametric which means it's only applicable to a normal distribution. There are non-parametric measures of variation that do not depend upon the distribution.
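The reply does not name any non-parametric measures, so as an assumption, the sketch below uses two common ones: the median and the interquartile range (the spread of the middle half of the readings). The readings are invented and deliberately skewed towards highs.

```python
import statistics

# Invented, right-skewed readings in mg/dl, for illustration only.
readings = [78, 82, 85, 88, 90, 92, 95, 99, 104, 110, 128, 155, 210, 265]

# quantiles(..., n=4) returns the three quartile cut points.
q1, median, q3 = statistics.quantiles(readings, n=4)
iqr = q3 - q1

print(f"mean = {statistics.mean(readings):.1f}, SD = {statistics.stdev(readings):.1f}")
print(f"median = {median:.1f}, IQR = {iqr:.1f} ({q1:.1f} to {q3:.1f})")
```

Neither the median nor the IQR relies on the readings being bell shaped, and both are less affected by a handful of very high values than the mean and SD are.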

Also, even though SD is descriptive, by definition, when you generate a Standard Deviation you must assume a normal distribution which means you are already saying something mathematical about your expectations for what the range of BGs ought to be. Look at the graph again. The SD is an expression of the percentages of the data set that should fall within a certain distance from the average. Don't call it inference then, but, by definition, your SD is telling you where your numbers should fall.

I picked the data set (80 - 120) and one day's worth of data to demonstrate your point, exactly, that SD, by mathematical definition, is highly dependent on the number of samples and is probably useless for saying anything about a day's worth of data. Is it useful for longer periods of time? That, again, doesn't depend at all on how many days of data you have, just the total number of data points. You can blow your SD to hell just by taking 20 measurements on a very bad day versus five on a very good day. I'll argue again, it's a moot point unless more days means you're generating a normal distribution.

Let me illustrate by using the often used example of a SD that should be 1/3 of the average BG (or 30 mg/dl). First, let me ask you what you think that number actually means, and why do you feel it's important in summing up your BG? More importantly, what does it say about your probability of having complications with an SD outside of 1/3 your average?

By using "Standard Deviation", here's what it means, by definition and for all intents and purposes when you generated the number using the formula used to calculate Standard Deviation.

Take an average BG of 90. At 1/3 the average, you should have an SD of no greater than 30, which means that you should expect 68.3% of your readings to fall between 120 mg/dl and 60 mg/dl, or one standard deviation to either side of the average. Not bad I suppose. But also, you can expect 99.7% of your BG readings to fall between 180, NEAR the top of your range and 0 NEAR the bottom of your range, or 3 standard deviations to either side of your average.
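The arithmetic behind those ranges is just the average plus or minus one, two, or three SDs. A minimal sketch, using the average of 90 and the 1/3 guideline from the example (the helper name `sd_bands` is made up for illustration):

```python
def sd_bands(mean, sd, ks=(1, 2, 3)):
    """Return {k: (low, high)} for mean +/- k standard deviations."""
    return {k: (mean - k * sd, mean + k * sd) for k in ks}

average_bg = 90
sd_limit = average_bg / 3  # the "no more than 1/3 of the average" guideline

bands = sd_bands(average_bg, sd_limit)
for k, (low, high) in bands.items():
    print(f"+/- {k} SD: {low:.0f} to {high:.0f} mg/dl")
```

Whenever the SD is a third of the average, the three-SD band always bottoms out at exactly zero, whatever the average happens to be.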

OOOkay.

I'm not trying to be flippant or an A-hole, but if that's not what you meant to say about your BGs then you shouldn't use Standard Deviation to try to describe variation around the mean of your BGs. It may be a convenient way of summing things up, but I'll say it again, that doesn't make it accurate or in any way meaningful.

For me, a better, maybe not more convenient, way to sum up my data is to actually take random readings during the day, try very hard to measure all of my spikes, and all of my lows, keep track of how many highs and lows I'm having, and take the appropriate steps.

Like I said, I do track my SD because it's something that's generated for me every time I log my BGs on my phone app. I, of course, do find some correlation between highs and lows and my SD. My goal, however, is to manage my highs and lows, not my SD. There is a difference.

Thanks again!

Thanks for the detailed explanations! What you say makes sense. I actually don't use SD because the software I use doesn't generate it (although my pump software does, but I don't download my pump daily)—and also because I have never been able to meet the 30 mg/dl or less than 1/3rd guideline, anyway. I tend to go by my highest/lowest reading in a week or month as well as the percentage of readings within certain ranges (using a histogram or pie chart). I wonder if the problems you outlined are why SD hasn't really caught on as a method of tracking control.
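The approach described here (highest and lowest reading plus the percentage of readings within certain ranges) is easy to sketch. The cutoffs of 70 and 180 mg/dl and the week of readings below are assumptions made for illustration, and the function name is hypothetical.

```python
def percent_in_ranges(readings, low=70, high=180):
    """Percentage of readings below low, within [low, high], and above high."""
    n = len(readings)
    below = sum(1 for r in readings if r < low)
    above = sum(1 for r in readings if r > high)
    within = n - below - above
    return {
        "below": 100 * below / n,
        "in range": 100 * within / n,
        "above": 100 * above / n,
    }

# Invented readings in mg/dl, for illustration only.
week = [62, 85, 90, 104, 118, 95, 140, 165, 210, 88,
        76, 130, 250, 99, 110, 68, 122, 93, 101, 175]

summary = percent_in_ranges(week)
print(f"lowest = {min(week)}, highest = {max(week)}")
for name, pct in summary.items():
    print(f"{name}: {pct:.0f}%")
```

Unlike SD, this summary makes no assumption about the shape of the distribution, and it keeps highs and lows separate.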

While the SD may not be a perfect measure of the variation of blood sugar, it is about the best we have. It is proportional to the variation, the larger the swings, the higher the SD and vice versa. If you want to see whether you have reduced your variations, you can measure your SD and see.

Is it subject to sampling bias (because you only test at certain times)? Yes. But all of us actually sample at the times we are "most likely" to see higher variations. So by definition, the sampling bias results in an overestimation of the variation.

I agree, you can't use the SD to "predict" the chances of a serious hypo. You can use SD to "explain" how variations place you at risk for hypos.

To me, the biggest use of the SD is to help me guide when I have gotten better or worse at controlling blood sugar swings. The SD works very well as a measure of my blood sugar variations and I continue to use it effectively.

I wonder if the problems you outlined are why SD hasn't really caught on as a method of tracking control.

I would hope so Jen. I have practical knowledge of statistics from grad school research, but I am by no means a statistician. What I'm talking about is Stats 101. I would hope someone out there with a deeper understanding is better able to discuss these issues, or, at the very least, better able to explain why such a poor tool for the job is being used.

Hi bsc,

My main issue, best available measure or not, is that I'm hearing people pushing for SD to be some kind of "industry standard" for measuring variation in diabetics.

That's a mistake.

Mathematically, SD has a definition that makes its application in the primary literature easily understood. Those results are, ostensibly, used by our health professionals to help us manage our diabetes, and more importantly, to determine policy. Because our BGs do not follow the assumptions necessary to accurately and precisely determine a mathematically meaningful SD, it's the wrong tool to use as a "standardized" measure of BG variation. Whether you believe an SD of 33% your average is meaningful to you or not, as far as even a descriptive measure of your BG profile goes, it doesn't represent a mathematically meaningful "Standard Deviation".

Furthermore, there's no way you can predict whether your SD will be an underestimation or an overestimation of variability. You may be expecting a low of 80 before a meal, but you might get a 150 instead. If you are only testing before a meal and 2 hours after a meal, you might be missing your peak BG. Those biases would result in an underestimation of variability. As a T1, assuming no hypo-insensitivity, I might be more likely to detect lows than highs, which means I'm more likely to test when I'm low than when I'm high. Can you, as a T2, reliably "feel" a low or a high?

Mathematically speaking, it's silly to even be talking about a "biased Standard Deviation". If you want a meaningful SD, you absolutely require completely randomized measurements.

Highly variable BGs place you at risk for both highs and lows, by definition. I agree that, generally, if you can generate lower SDs, that "probably" means you're at a lower risk for having hypos, and spikes. That, however, doesn't tell you "how" variations place you at risk for hypos.

That still assumes that you are, at the very least, taking fairly random measurements. You may have a "high" SD and assume that you are at a higher risk of having a hypo, but if your higher variation is being driven by BG spikes, you've just drawn the wrong conclusion.
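To illustrate why a high SD on its own says nothing about direction, here is a sketch with two invented days that mirror each other around 100: one driven by spikes, the other by lows. The cutoffs of 70 and 140 mg/dl are assumptions chosen for the example.

```python
import statistics

# Two invented days (mg/dl) that mirror each other around 100.
spike_day = [100] * 8 + [150, 160]
hypo_day = [100] * 8 + [50, 40]

for label, day in (("spike day", spike_day), ("hypo day", hypo_day)):
    lows = sum(1 for r in day if r < 70)    # 70 is an assumed hypo cutoff
    highs = sum(1 for r in day if r > 140)  # 140 is an assumed high cutoff
    print(f"{label}: SD = {statistics.stdev(day):.2f}, "
          f"lows = {lows}, highs = {highs}")
```

Both days produce exactly the same SD, yet one has two lows and no highs and the other has two highs and no lows.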

I agree that, if you are paying attention to your BG trends, number of highs and lows, how many times you are testing and when, etc., your SD can have some kind of meaning for you in terms of BG variability. However, because of the tendency towards unpredictable bias, your number may have absolutely no relationship to mine.

Thanks again bsc.

I don't get 150s before meals! ;-)

I think I've only seen a few people lobbying for the SD in lieu of A1C or saying it's better somehow and I think I usually try to disagree with that suggestion. If I had to pick, I'd pick 1) BG 2) A1C (an average of BG anyway) and 3) SD would be way in back. I think that the high SD would place you at a higher risk of having a hypo if you do what you are supposed to and fix it. I am not sure what people who don't hang out on message boards do about that stuff though and, judging from occasional visitors/ new members reporting higher A1Cs (not that it's entirely their fault, I think that the medical system fails to inculcate people with an appropriate sense of OCDiabetes that may be part of the way to achieve ok results...), they end up either abandoning attempts to achieve control very much at all or perhaps getting engaged in "rollercoastering" as BGs fly up and down. I've done a bit of the latter, although not for a while, and it can be hair raising.

Maybe SD is more useful as a "fine-tuning" number than as a "goal" or "standard"? Doctors don't seem too interested but every now and then one will blurt something out about it. Although since I gave my doctor the "keys to the kingdom" (CareLink ID...), she mostly reports "all your numbers are good" and I'd rather worry about making them better myself?

Thankfully, I haven't seen anybody who's that far out on the fringe, at least on this forum. If I did see that argument somewhere in the forums, I'd probably just back slowly away.

No, but I have seen the mention of SD pop up enough on the forums to want to post this blog. Nothing against people who are proponents but I thought it was worthwhile to post a counterpoint.

Like I said, if people are finding SD useful, in some way, great! I just think that the message that it's not that great of a tool for comparisons between diabetics is kinda lagging behind. Certainly, if my endo wanted to talk to me about my SD, we'd have to have a loooong discussion.

Here's a bunch of threads. Maybe nobody quite says "instead of A1C" but I have a vague recollection that Natalie's mentioned that her A1Cs don't read correctly, making the number less useful for her or something like that? I had no idea we had so many people from South Dakota....

https://forum.tudiabetes.org/topics/effectiveness-of-insulin-pump-therapy?commentId=583967%3AComment%3A2503415

http://www.tudiabetes.org/group/dexcomusers/forum/topics/what-is-your-glucose-standard?commentId=583967%3AComment%3A880631&groupId=583967%3AGroup%3A168831