Why Standard Deviation of Blood Glucose can be a meaningless number for diabetics:

I have to tell you, this stuff is all confused. First, the relationship between average blood sugar and A1c across a "population of individuals" is a "normal" distribution. The SD makes sense in describing the variability of A1c values for a given average blood sugar. Figures quoted about the variation of the A1c being +/- 20% pertain to the population statistics. However, for a single individual, it behaves entirely differently. An individual may be a high or low glycator. In those instances, the A1c probably follows a normal distribution across a "population of tests", but the variation is more like 5-7% (see the NGSP).

FHS is of course right: SD is only one measure of variation (sometimes called dispersion). And blood sugar readings are not a "normal" distribution, for a number of reasons. First, we are more aggressive about treating lows than we are about highs, so the distribution is lopsided. And there are other subtle things that mess with the distribution: we tend to "correct" highs and lows toward the center.

But the distribution is still very close to normal (at least in my case). Below is the graph of my three primary tests (morning fasting, pre-dinner, and post-dinner) over the last year (a total of more than 1,000 blood tests).


If I compute my average and SD (assuming a normal distribution), then I would expect that 68.2% of my readings would be in 85.1 - 127.7 mg/dl. When I look at my readings, I find that 69.1% of my readings fall within that range. If I look at 2 and 3 SDs, I get a similar match, 95.4% versus 95.7% and 99.7% versus 99.4%.

My conclusion? A normal distribution is in fact a very close measure of the variation in my blood sugars and the SD is in fact quite accurate in making estimates of the chances of my blood sugars being in certain ranges.
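bsc's check is easy to reproduce. Here is a sketch using simulated stand-in data (random numbers generated from the average and SD implied by his 85.1 - 127.7 range), not his actual spreadsheet:

```python
import random
import statistics

# Hypothetical stand-in for bsc's ~1000 readings: simulated normal BG data
# with mean/SD chosen to match the 85.1-127.7 one-SD range he reported.
random.seed(0)
readings = [random.gauss(106.4, 21.3) for _ in range(1000)]

mean = statistics.fmean(readings)
sd = statistics.stdev(readings)

# For each k, compare the theoretical normal coverage (68.3 / 95.4 / 99.7 %)
# with the fraction of readings actually inside mean +/- k*SD.
for k, expected in [(1, 68.3), (2, 95.4), (3, 99.7)]:
    lo, hi = mean - k * sd, mean + k * sd
    actual = 100 * sum(lo <= r <= hi for r in readings) / len(readings)
    print(f"{k} SD: expected {expected}%, actual {actual:.1f}%")
```

With real meter data you would simply replace the simulated `readings` list with the downloaded values; the comparison logic is the same.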

Hi bsc,

Nice! Thanks for sharing your data here. Much appreciated!

Obviously what you did is meaningful for you but I have some questions given these data. Hopefully, these don't come across as criticism. I just want to clarify the important point of how meaningful Standard Deviation is for describing variation in BG.

The first thing we have to clarify is that we are looking at your "sample of data" that approximates a normal distribution. Remember, our data set is assumed to be a random sample of our entire "population". It must be a random sample. In our case, it should be a random sample of the "population", which is our entire BG profile (remember our old discussion that BGs aren't even discrete).

If our BGs over an entire day, or entire week, or entire year are not normally distributed, then what our data set looks like does not matter. In fact, our sample set becomes misrepresentative of what is actually happening to us if it shows a normal distribution but our actual entire "population" of BGs is not normally distributed, especially if our sample set is not random.

So, the question is, were your 3 samples a day truly random, and do 3 samples a day really represent what's happening over the entire 24 hours? More importantly for calculating SD, did you omit data?

Regardless of whether the data are normally distributed or not, why would you even need to calculate a standard deviation when you already have what you need? As I said in my post, you would calculate an SD to help describe your data set when you can't collect every sample. You collected the samples that are meaningful to you, so you used a much simpler and straightforward calculation using percentage of your actual data points over a given range to determine, with more accuracy and precision, the same exact thing that SD can only approximate.

Are the 85.1 - 127.7 or any other ranges determined by the standard deviations even meaningful to you? If you went through the trouble of double-checking the ranges determined by SD, why not do the same exact calculation with, say, a range of 70 to 120, representing the percentage of points that are within normal range? Or the percentage of points above 120? Or any range of numbers that has more meaning for a diabetic, other than a mathematically generated number like a standard deviation, which may not have any meaning at all?
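The range-based alternative suggested here is straightforward to compute. A sketch, again with simulated stand-in data rather than anyone's real readings:

```python
import random

# Hypothetical sample standing in for a year of meter readings.
random.seed(1)
readings = [random.gauss(106.4, 21.3) for _ in range(1000)]

def pct_in_range(data, lo, hi):
    """Percent of readings falling within [lo, hi] mg/dl."""
    return 100 * sum(lo <= r <= hi for r in data) / len(data)

# Ranges that are directly meaningful to a diabetic,
# instead of an SD-derived interval:
print(f"70-120 mg/dl (normal range): {pct_in_range(readings, 70, 120):.1f}%")
print(f"Above 120 mg/dl: {100 * sum(r > 120 for r in readings) / len(readings):.1f}%")
print(f"Below 70 mg/dl (hypo risk):  {100 * sum(r < 70 for r in readings) / len(readings):.1f}%")
```

The 70/120 cutoffs are just illustrative; the point is that any clinically chosen range can be checked with the same one-line calculation.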

Finally, and really the important point. I'm not trying to be pedantic, or obtuse, or an A-hole, but I still don't get why that number (21.3?) should be in any way meaningful to me.

Thanks again bsc!

How did you create that chart?

I've always seen a lower SD as an indicator of fewer spikes/drops, or less variability. When my SD is higher, I do see more variability - when lower, less.

I think this is an important point worth restating. Since Standard Deviation is a measure of dispersion, as bsc says, it absolutely can, implicitly, tell you something about your variability that can help you decrease your variability.

Standard Deviation, however, has an explicit mathematical definition that, I argue, has very little worth to a diabetic, for various reasons.

Here's the graph from my post again:


The x axis shows + or - s.d., Standard Deviation. Once you use your data to calculate a Standard Deviation, the model assumes you will find a certain percentage of your entire BG profile, for the period of time you took your readings, within a mathematically determined distance from the average (the “Standard” Deviation).

If, however, your profile or “curve” doesn’t look like the one in the graph, the SD you generate doesn’t approximate anything.

Even in the best-case scenario, where it does, as in the example bsc posted, somebody still has to explain to me why I should care about that number. It doesn’t represent the percentage of BGs in normal range. It doesn’t represent the percentage above or below normal range either. Just do what bsc did to double-check SD and pick a range that means something to you.

You want your numbers not to swing a lot because swings have a number of meanings in terms of your "operation". Your rates/ratios may not be correct, your carb counting may be off, you may not be considering the impact of exercise or activity (shopping used to mess me up a lot...). Also, if the curve is such that your average includes *very* highs and *very* lows, those particular conditions have particular hazards associated with them.

If your statement is directed towards me acid, I get that. Variation is important, for many reasons.

oops, nope, just the "generic" "you", if "one" were looking to derive some meaning from the number the computer spits out or you or BSC or other more math-oriented people would come up with?

I distinguish between the blood sugar readings for an individual (like myself) and blood sugar readings for an entire member population (like every tudiabetes member). The data I presented is a subset of my testing, I test 4-10 times/day. The data does not have to be random, just uniform. In my case, I tested every day and on an infrequent basis (at random) missed a test. So my data is relatively unbiased. I did not omit any data from the selected sampled population (morning, pre and post dinner). Estimates based on the average and SD of course only apply to predictions about that sampled population. You are correct, it does not apply to other times. But it does show that the underlying variation is fundamentally a uniform random process. This does not suggest anything about someone else, or how pooled blood sugars for say all members of tudiabetes might look.

While we could argue about whether the "rest" of my blood sugars are fundamentally different, my data suggests that fasting numbers and around meals, over the long-term, follow a normal distribution. And as such, the SD is a good measure of variability.

As you note, I could use other measures, such as range or, even better, interquartile range. Since the actual distribution is so close to normal, these measures give no better insight. So I guess I am confused: I can readily calculate the SD, it is meaningful, and there does not appear to be any better summary measure. Why would I use anything besides the SD to measure my variation?

Notes:
The point of the range 85.1 - 127.7 is that it is 1 SD around my average. I am not sure where the 21.3 number came from.

I created the chart with Excel.

I think you hit the nail on the head acid.

If a person is not mathematically inclined, we might just want to know a number that we can refer to to help us understand if we are at a higher or lower risk of complications. We have conveniently been given "standard deviation". Furthermore, we have conveniently been given "keep your standard deviation below 33% of your average", ostensibly to keep your risk of complications lower.

I'm not even "mathematically inclined" but I have enough experience with statistics to know that when I'm told that, and I look at my data and see that my average BG is 100 mg/dl with a standard deviation of 33 mg/dl, that guideline just makes no biological sense. Mathematically, it tells me that I should expect over 99% of my BGs to fall between 1 mg/dl and 199 mg/dl. That's like telling me I should expect the sun to rise tomorrow.
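The arithmetic behind that objection is easy to check:

```python
# The guideline "keep your SD below 33% of your average" applied to an
# average of 100 mg/dl allows an SD of 33 mg/dl. The usual "99.7% of
# values fall within 3 SDs" rule then gives a range so wide it is
# biologically meaningless:
avg, sd = 100, 33
lo, hi = avg - 3 * sd, avg + 3 * sd
print(f"expect 99.7% of readings between {lo} and {hi} mg/dl")
```

That is exactly the 1 - 199 mg/dl span described above: essentially any BG a living person could have.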

Someone who isn't mathematically inclined, or doesn't have the experience with stats, won't know that and will take it on faith that they have calculated something relevant to their diabetes management.

If you are just using standard deviation to get a qualitative "feel" for whether your BG's have a lot of variation or not (in other words, not the explicit mathematical definition of SD), then fine. A "smaller" SD will probably tell you that you have "less" variation than a "bigger" SD.

There are so many caveats attached to that "probably", though, that, ironically, you're going to have to be mathematically inclined enough, like bsc, to sort out the distinctions.

The point of the range 85.1 - 127.7 is that it is 1 SD around my average. I am not sure where the 21.3 number came from.

Right.

85.1 - 127.7 is the range of one SD around your average. You did not actually give me your SD or average, so I had to estimate both from the range you gave me. I estimated your average BG to be 106.4, right at the center of that range, 85.1 - 127.7. So, to get to 85.1 from 106.4 on the low end, and to 127.7 on the high end, I have to go 21.3 mg/dl, which is the estimated SD from your data set.
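That back-of-the-envelope estimate works out as follows: the center of a symmetric one-SD range is the mean, and half its width is the SD.

```python
# Estimating bsc's average and SD from the one-SD range he reported
# (85.1 - 127.7 mg/dl).
lo, hi = 85.1, 127.7
avg = (lo + hi) / 2   # center of the range -> estimated average
sd = (hi - lo) / 2    # half the width      -> estimated SD
print(f"estimated average = {avg:.1f} mg/dl, estimated SD = {sd:.1f} mg/dl")
```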

The data does not have to be random, just uniform

Respectfully, I think you should double-check this. I do believe that the underlying assumption that must be met is that the sample from the "population" must be random.

Rather than argue minutiae, though, I think you actually hit on a valid way to, at the very least, measure SD for BG.

If you use a single variable such as "premeal" for your BGs, you can use all premeals or, better yet, individual premeals (breakfast, lunch, or dinner) to represent your "population" of BGs. I don't believe there is any reason to assume that treating your BGs one variable at a time won't generate a normal distribution, or will violate either assumption, as you have shown. That way, the time of day, or more importantly, anything that happens during that time, can be controlled for. You will have a normal distribution, and your samples will essentially be random representatives of what happens at that time of the day.

If you look at your data set, I'm assuming those highs that are giving you that long tail to the right of your average are being generated by the post-meal BGs. You can drop those from this data set, get a much better distribution around the mean for "premeal" BGs, then get more data for "postmeal" BGs as your variable to generate another data set for those, doing the same analysis.

You can then compare those two data sets and get a good idea of what "the meals" as an independent variable are doing to your postmeal BGs.
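The stratification described above might look something like this, with made-up premeal/postmeal numbers rather than bsc's actual data:

```python
import random
import statistics

# Hypothetical tagged readings: premeal BGs tighter around a lower mean,
# postmeal BGs higher and more spread out (the long right tail).
random.seed(2)
readings = (
    [("premeal", random.gauss(95, 12)) for _ in range(500)]
    + [("postmeal", random.gauss(130, 28)) for _ in range(500)]
)

# Split the pooled data set one variable ("meal timing") at a time.
by_tag = {}
for tag, bg in readings:
    by_tag.setdefault(tag, []).append(bg)

for tag, values in by_tag.items():
    m = statistics.fmean(values)
    s = statistics.stdev(values)
    print(f"{tag:9s} n={len(values)}  mean={m:.1f}  SD={s:.1f}")
```

Comparing the two per-group means and SDs is then the "what are the meals doing to my postmeal BGs" comparison in one glance.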

Good stuff!!

My last argument, though, one that I still don't have an answer for, is: why is the standard deviation so important for you in the first place? All the SD is, mathematically valid or not, is a number that establishes a range around your average in which to find a certain percentage of your data points. In this case, one SD around your average represents a range of 85.1 - 127.7.

What meaning does that range have for you? Why not just establish a more meaningful range like 70 - 90 for premeal BGs, and do the same calculation you did?

Thanks again!

Yes, I had to look back at the spreadsheet, my average over that time was 106.4 mg/dl and the SD was 21.3.

Look, I have a spreadsheet which computes a whole bunch of things. It also computes "compliance," which looks at whether my fasting numbers are 70-110 mg/dl and whether my postprandial is 70-140 mg/dl. It looks at those things, but that information is of limited use. If I want to see a measure of how much variability I have in my control, the SD is a better measure.

Yeah, my iPhone app, before my iPhone died, tracked premeal and postmeal BGs and averages, as well as SDs for those data sets. It also, of course, kept track of the overall average and SD for the entire data set. I probably test between 10 and 20 times a day, so 1000s of data points got buried along with my iPhone. =/

I had to set up a separate spreadsheet to keep track of the percentage of the data set spent within normal range and target range for pre- and postmeal BGs.

I found out that if I focused on keeping the percentage of any out of range numbers down for the data in my spreadsheet, both the overall average and overall SD for my entire data set from my iPhone app went down. The SD for pre and post meal BGs went down too.

Heh, big surprise.

The take home message for me was, keep my out of range numbers, both pre-meal and post-meal to a minimum. The data dumps from my Dex, when I finally got one, confirmed that.

Diff'rent strokes for diff'rent folks.

Thanks again bsc.

Hmm. You got me thinking.
1. Is an HbA1c an SD (standard deviation, say with our low not being zero, because you would be in a coma)? My definition is that it's the amount of sugar that sticks to your blood cells over a 90-day, aka 3-month, period, or thereabouts, whenever your endocrinologist's secretary schedules you in the office.
You don't have to answer this one:
2. What kind of math would be more accurate? (heads-up: I did 2 1/2 years of Calculus two decades ago, though my use of it is currently zero)
As a LADA type 1, I use my daily numbers (I rarely do over 4 daily, but will if necessary: $ thing) as a way to gauge how much insulin I inject. It's a learned art: am I sick, how does my body feel (it's not voodoo, I can feel the rare hypo coming on). Some days get more; I find my body sometimes has ranges, patterns.

3. Isn't what we are really measuring how much blood vessel damage (and all its unlovely daughter complications: neuropathy, retinopathy, etc.) is being done to us if our blood glucose numbers are over a certain level? Some say 140 mg/dl is when damage occurs (others say 83 mg/dl, or 104 mg/dl). Numbers outside the range are really a measure of our accelerated death and aging.
Super Bonus Prayer-shot-in-the-dark question: when we are sick and infected, our blood glucose numbers rise (online queries tell me the body is releasing hormones which raise it; I am still asking around and reading). You don't happen to know if that does the same amount of blood vessel damage as when you are not ill, do you?

PS. It's super that your A1Cs are in the 5% club. You look really in shape, which, from what I read, means exercise helps with diabetes. I am a newbie who got diagnosed this summer and think I would have been in the 5% club, but infection really raised my numbers this last time. Accept my good-natured, envious jealousy.

Thanks for your thoughts on this statistical figure. With a bit of stretching of the truth, I would argue that the distribution of the blood glucose readings trends toward a normal distribution the better the control actually is. Maybe a bit biased to the left or right, but there is still a bell-shaped curve sitting on the average.

Like you, I would argue that we need many more readings for that, and these should be representative of the real blood glucose. The best way would be to take the data of a digital diary (I have something in mind ;-) and to simulate the blood glucose between the measurements with the information in the diary: carbs will raise the blood glucose; insulin and physical activity will decrease it. Like a full-grown simulator of the blood glucose. This simulated curve would then connect the single dots of our measurements. It would be just a questionable simulation, but better than working with the dots alone.

For the Glucosurfer project we have decided that we at least need to weigh in the factor of time. This is why we connect the measurements with a simple linearization. This way we have a (linearly simplified) blood glucose value for every minute of the day. As a result, the duration of a blood glucose level has some influence on the average and the standard deviation based on these numbers.
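As I understand the linearization described here (a sketch of the idea, not the Glucosurfer's actual code), it might look like this with a few made-up readings:

```python
import statistics

# Connect discrete readings with straight lines to get a value for every
# minute, so that how long a BG level lasts is weighed into the stats.
readings = [(0, 100), (120, 180), (300, 90)]  # (minute of day, mg/dl)

per_minute = []
for (t0, bg0), (t1, bg1) in zip(readings, readings[1:]):
    for t in range(t0, t1):
        # Linear interpolation between consecutive readings.
        per_minute.append(bg0 + (bg1 - bg0) * (t - t0) / (t1 - t0))
per_minute.append(readings[-1][1])  # include the final reading itself

print(f"time-weighted mean = {statistics.fmean(per_minute):.1f} mg/dl")
print(f"time-weighted SD   = {statistics.stdev(per_minute):.1f} mg/dl")
```

With only the three raw dots, the long slow descent after minute 120 would count the same as the short spike; weighting by minutes lets duration matter.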

I fully agree that the SD should not be used for comparison between individuals. I also agree that the number itself is not that important. But in our diagrams the SD still is a good visual indicator of variability. For example, the following diagram is based on the linearized data. It shows the SD in yellow; the red line is the mean of that day, and the black lines show the span between the highest and the lowest reading. In my opinion the black span is too unspecific because the factor of time is missing. But the yellow area shows the standard deviation around the red average value. First we see some days with better control, and then we see days with way too high variability. To me this clear visual feedback is the real benefit of the SD in this context and on this data.


Holger, my non-statistical brain can't decipher this
I also agree that the number itself it not that important. But in our diagrams the SD still is a good visual indicator of variability.
How can it be a good indicator of variability yet not be important, assuming variability is the factor one is looking for?

@jrtpub: I meant that the statistical figure SD just makes sense in comparison with SD numbers of the same individual. This comparison can be helpful on a daily and a monthly level to find a trend. The exact number is less important than the information that for the last two months in a row the individual was able to significantly reduce the SD. Actually we do not have a diagram for monthly analysis of the SD. It would be another interesting addition for trend analysis in the Glucosurfer.

Thanks Holger. I can get monthly/quarterly stats from the Dexcom and Ping data (which often differ, of course), and find that combined with a good look at the data it's really helpful for me.

Thanks for the input and for sharing these data, Holger. What unit of time do the numbers on the x axis represent?

@FHS: the x axis is time in days. In this diagram you see single days. These days are connected to see the progress. The colors might look odd because we try to compensate a possible color blindness of our users.

Dexcom: Estimated Standard Deviation Measurement

What’s your ESD? Do you feel comfortable with it?

SD has huge value, for me, right now.

I’m trying to get an idea of whether my ESD is high, and I lost all my ‘high variability’ data from a few months ago when my computer crashed. It’s important because I am running a pretty low A1c, 6.1 or so. I suspect that variability has decreased enough to support that A1c, or a slightly higher one, because I feel pretty good. My Doc thinks I’m going to die of hypo. Neither one of us knows for certain. I’m gonna collect some new data, but am interested in seeing how my current ESD compares with the herd’s, and with my old ‘high variability’ ESD. I will be looking at whether my SD increased or decreased. That’s really all that’s important.

FHS, it’s OK to assume that the numbers are random. You kinda always assume that they are random when you’re working with statistics, even though the numbers rarely are ([Central Limit Theorem][1]). The key is to use a large enough data set, so this stuff is always gonna be tough with finger sticks. We collect data and perform statistics on events that aren’t random all the time. It presents complications, but it doesn’t make the statistics useless. Even throwing a coin is not totally random, but we can still use statistics to learn things about the coin-flipping event.

So, I tend to use really large, one-month to three-month data sets to measure SD. I wouldn’t use any data covering less than 20 days, and that’s with about ~300 sensor measurements per day. It’s the difference between counting ten fish in an aquarium versus a million fish in Lake Superior. If you count enough fish in a big enough lake, it all comes out in the wash. (That’s assumed - it may or may not be 100% true in all cases, but it’s assumed. You could, for instance, be counting fish at the point where a river flows into the lake, and there could be a big trout migration at that time of year, which skews the numbers towards a population of 100% trout. But we assume that doesn’t happen when we make statistical assumptions. We assume that if we see a bunch of goofy data, we will be able to identify a trout migration. Also, if we collect an infinite amount of data, day after day, week after week, rain or shine, the trout migration will end, and our data from the river will get closer to the actual fish population in the lake.) Does that make sense? Maybe someone else can explain it better.

So, Estimated Standard Deviation becomes important to me when my Doc says, ‘your A1c is too low.’ She means, ‘Your A1c is too low for the amount of variability in your system.’ She thinks my average is so low that I am at high risk of loosing consciousness. There are a few ways to measure if that’s true, quantitatively, but I’m honestly forming the bulk of my opinion on how I feel. I would like to see confirmation in the data. Thats what SD is for. I haven’t collected any new data yet. Also, there are problems with how my Doc is going to make her evaluation because she is going to take her SD measurement off of a three day data set. She is assuming that three days can estimate three months, and correlate with the A1c. I think that’s too small to be meaningful. That’s an aquarium. She is making decisions based on data that might be coming outta the river. I will have to take a trip to the lake, collecting data, and that will take about a month…probably. I hope that will give me a pretty good estimate in the variability that I have been experiencing over a three month period. Although, the estimate could be better if I had a big three month, Lake superior data set. Because my system changes a lot,it might not be possible to get a real big, real representative dataset, but that’s life. I’ll just assume that it is representative enough. Thats the best I’m going to be able to do. I’ve been enjoying a much needed break from looking at this darn data.