Meter Review - Semi-Digression: When Numbers Don't Make Sense

I haven't gotten to the point of running the contaminant tests on the meters -- I'm still trying to figure out accuracy issues with the Advocate Duo, and I'm puzzled by the results I'm getting so far on the parallel test check.

The short version is this:

When I checked my (active) Freestyle Flash last time I was in the doctor's office (June 10th), not only was my reading "in range" with the lab reading, it was spot on. Exactly the same. So, I go on my merry way thinking my meter is accurate.

When the A1c-to-eAG conversion table came out, I found that my 5.8 reading meant my average blood glucose level was 120 -- which, in Type 2, is an actionable level. So I'm thinking, maybe my test timing is off. I added back in some random tests, and except for some foods spiking a bit more than I expected for a bit longer than I expected, I'm not seeing anything really weird-and-out-there...

Until I started testing out that handful of meters.

Now, I understand that meter error can range up to 20%, though almost all of the current-generation and immediately-previous-generation meters say that they measure to within 5% of lab accuracy -- at least in the range between about 80 and 200 mg/dl. That suggests a 4-8 point difference at the lower end of that range, and 10-20 points at the upper end.
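To put those percentages into points, here is a quick sketch of the arithmetic. The helper name `error_band` is just something made up for illustration. It shows that 4-8 points at 80 mg/dl and 10-20 points at 200 mg/dl correspond to a 5-10% band, while a full 20% error would be 16 and 40 points respectively.

```python
def error_band(reading_mg_dl, tolerance_pct):
    """Return the +/- spread in mg/dl that a percentage tolerance allows."""
    return reading_mg_dl * tolerance_pct / 100.0

# Bounds of the range where the manufacturers quote their accuracy figures
for level in (80, 200):
    for pct in (5, 10, 20):
        points = error_band(level, pct)
        print(f"{level} mg/dl at +/-{pct}%: +/-{points:.0f} points")
```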

Assuming that all of the meters fall into this accuracy range, I can logically posit that an average of their readings taken at the same time, from the same sample, will come closer to a lab result than the reading of any one meter. Averaged over seven meters, over multiple parallel readings (to weed out any single rogue readings from any meter), any rogue meter should stick out like a bloody fingertip. And any meter whose readings are erratic relative to the average should show up as having a high standard deviation of readings from the average.
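The approach above can be sketched roughly as follows. This is a minimal illustration in Python rather than the actual spreadsheet: the function names, the meter labels "A", "B", "C", and the readings are all hypothetical, not values from my tests. For each parallel test it takes the cross-meter average, converts each meter's reading to a percent deviation from that average, and then summarizes each meter by its mean deviation (bias), mean absolute deviation, and standard deviation.

```python
from statistics import mean, stdev

def percent_deviations(sessions):
    """sessions: list of dicts mapping meter name -> reading (mg/dl),
    one dict per parallel test taken from the same blood sample.
    Returns meter name -> list of percent deviations from each session's average."""
    deviations = {}
    for readings in sessions:
        session_avg = mean(list(readings.values()))
        for meter, value in readings.items():
            pct = 100.0 * (value - session_avg) / session_avg
            deviations.setdefault(meter, []).append(pct)
    return deviations

def summarize(deviations):
    """Per meter: signed mean (bias), mean absolute deviation, and sample stdev."""
    summary = {}
    for meter, pcts in deviations.items():
        summary[meter] = {
            "mean_pct": mean(pcts),
            "mean_abs_pct": mean([abs(p) for p in pcts]),
            # Sample standard deviation needs at least two readings
            "stdev_pct": stdev(pcts) if len(pcts) > 1 else float("nan"),
        }
    return summary

# Hypothetical readings, for illustration only
sessions = [
    {"A": 100, "B": 110, "C": 90},
    {"A": 120, "B": 132, "C": 108},
]
summary = summarize(percent_deviations(sessions))
for meter, stats in summary.items():
    print(meter, stats)
```

One caveat with this design: the average includes the meter being judged, so a meter that reads consistently low drags the baseline down with it and makes the others look a little high. Comparing each meter against the average of the other meters (a leave-one-out average) would be one way around that, though I haven't assumed it here.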

  • The good news is that over 10-20 readings (varying from meter under test to meter under test), the Advocate Duo (which I'm not sure is "good" or not) is the only one that deviates more than 10% from the average. (It also has fewer readings, across a known-bad and a possible-bad device, so it may be worth tossing out of the mix.)
  • The bad news is that my active Freestyle Flash runs consistently lower than the average, to an average of about 9.3% lower with about 6% standard deviation. So now I am wondering if I've just diagnosed a suboptimal Flash...

Well, I have a spare Flash meter I've never needed to put into service. I think I'm going to have to start adding that into the mix to verify that diagnosis.

I'm not sure whether I should toss out the Advocate readings, or if they've been adversely affected by the low Flash readings, or if I should throw out the Flash readings (which I have been using as a baseline)...

Meanwhile, of the tests run so far, analyzing across all the meters originally under test, the Accu-Chek Aviva comes out with the lowest deviation from the average, the lowest absolute value of deviation from the average, and the lowest standard deviation -- all under 5% in the range tested. (I don't have any good spike/hypo tests from that meter yet, though, so it may change a bit.) After that, the One Touch Ultra Mini shows up slightly better than the Ultra2 (using the same vial of test strips), but the difference is statistically negligible. The absolute value of their difference from the average is about 5.25 percent, with a standard deviation of just over 6% for the Ultra Mini and about 6-1/3% for the Ultra2. The Freestyle Lite shows higher variability yet, and the Keynote is the other "odd man" of the bunch, coming out consistently higher than the average, by almost 8 percent on average (7 percent if I ignore the tests that were just the Freestyle Flash and the Keynote). Its standard deviation is about 7%. I think the Keynote may also have a higher-magnitude difference from the average on forearm tests than on fingertip tests, but I haven't done the analysis yet.

Now... before you think of chucking your current meter for a replacement, please bear in mind:

  • I've only tested one of each model (two, if you count the bad Advocate that needed replacement, and the spare Flash, which has only received one reading so far).
  • I have conducted only a limited number of tests across any portion of these meters; fewer against all at the same time, and all on the same subject (me). This makes the data extremely sensitive to bad/rogue readings.
  • The meters have only been tested in close to euglycemic range; their performance under hyper- and hypoglycemia may differ.
  • All the meters came in with under 20% perceived error across those readings, and most came in under 10%. This is within manufacturer specifications.

The data so far, in Excel 97-2003 format, can be found here: meter test.xls.

I still need to do more testing and data analysis, as well as install software and cables to check out each of the meter manufacturers' data management programs.

I am impressed! This must be the part of stats class that I missed. :slight_smile:

Dunno about that, Misty: I never had a statistics class. Basic junior-high and high-school statistics were just range, mode, median, and mean (average); other courses required for my bachelor’s degree (in engineering, many years ago) added in variance and standard deviation, standard distribution, and Poisson distribution (I don’t remember if Maxwell distribution was the same as standard distribution or another distribution entirely, or how to graph any of those distributions). Excel has a lot of functions built into it, so I don’t have to remember exactly how to calculate variance or standard deviation – only that variance is VAR and standard deviation is STDEV…
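For anyone without Excel handy, here is a small sketch of what those two functions compute. Excel's VAR and STDEV are the sample versions (dividing by n - 1 rather than n), and Python's `statistics.variance` and `statistics.stdev` do the same. The list of percent deviations is made up for illustration and is not from my spreadsheet.

```python
import math
from statistics import stdev, variance

def sample_variance(values):
    """Sample variance: sum of squared differences from the mean, divided by n - 1."""
    n = len(values)
    avg = sum(values) / n
    return sum((v - avg) ** 2 for v in values) / (n - 1)

# Hypothetical percent deviations from the average, for illustration only
deviations_pct = [-8.0, -12.0, -6.0, -10.0]

manual_var = sample_variance(deviations_pct)
manual_sd = math.sqrt(manual_var)

# The hand-rolled values should agree with the library functions
print(manual_var, variance(deviations_pct))
print(manual_sd, stdev(deviations_pct))
```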