Geeks! Could we build an open-source big data pancreas-brain?

Here's the idea. Please, tell me what you think. Do you know people who could do this? (I'm less interested in regulatory feasibility than in technical feasibility.)

PROBLEM: Every day, several times a day, we take our lives in our hands when we decide on insulin, food or exercise activities. No one is smart enough, certainly not me, to use even the limited data sets we have to effectively or accurately treat ourselves. Poring over charts and graphs is not my forte. There are way too many variables (e.g., food type, amount, mix, recency; time of day; exercise type, amount, recency; stress level; insulin type, amount on-board, basal levels, bolus levels; personal data like weight, BMI, and work environment, etc.), they are all in constant flux and their relationships are often (apparently) inconsistent.

SOLUTION: Crowdsource, anonymize and aggregate all that data from Type 1 geeks and smartphone users globally. Analyze it for patterns, and feed relevant insights back to individuals in the crowd for them to use to help them decide what to do now - in the situation they find themselves in. Should they eat, bolus, walk, run? And how much of each?

  • EXAMPLE: I'm a 50-year-old T1D, weight 190#, with a sedentary job; I just ran a mile; I last took a bolus of Novolog 5 hours ago and my current pump basal rate is .5u/hour. My BG is 177 and dropping (per my CGM). WHAT SHOULD I DO NOW? In its simplest incarnation, our system would search for similar scenarios in the global data set, as well as similar scenarios in my personal data set, and let me know how similar they were and what happened in those cases when various actions were taken. For example, the next action might have been:
    • Ate a pint of ice cream->BG steady for 3 hours;
    • Drank 8 oz. of juice->BG up 25 points then down 45 after 3 hours;
    • Etc.
  • Based on this analysis, I will be able to make a much more informed decision as to what I ought to do next. Certainly way more informed than I am today.
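A rough sketch of the "search for similar scenarios" step, written as a plain nearest-neighbor lookup. Everything here is an assumption made for illustration: the `Scenario` fields, the `SCALES` values used to make the variables comparable, and the three made-up history records (their actions and outcomes just echo the example above). A real system would need far more variables and real data.

```python
from dataclasses import dataclass
import math


@dataclass
class Scenario:
    age: float                # years
    weight_lb: float          # pounds
    bg: float                 # mg/dL
    bg_trend: float           # mg/dL per minute, negative = dropping
    hours_since_bolus: float
    action: str = ""          # what the person did next
    outcome: str = ""         # what happened afterwards


# Hypothetical scales: how big a difference counts as "one unit apart".
SCALES = {"age": 10.0, "weight_lb": 20.0, "bg": 30.0,
          "bg_trend": 1.0, "hours_since_bolus": 2.0}


def distance(a, b):
    """Scaled Euclidean distance between two scenarios (smaller = more similar)."""
    return math.sqrt(sum(((getattr(a, k) - getattr(b, k)) / s) ** 2
                         for k, s in SCALES.items()))


def similar_scenarios(query, history, k=3):
    """Return the k closest past scenarios as (distance, action, outcome)."""
    ranked = sorted(history, key=lambda s: distance(query, s))
    return [(round(distance(query, s), 2), s.action, s.outcome) for s in ranked[:k]]


# Made-up records for illustration only.
history = [
    Scenario(48, 185, 180, -1.0, 5.0, "drank 8 oz. juice",
             "BG up 25, then down 45 after 3 hours"),
    Scenario(22, 130, 90, 0.5, 1.0, "no action", "BG steady"),
    Scenario(52, 195, 170, -1.5, 4.5, "ate a pint of ice cream",
             "BG steady for 3 hours"),
]
query = Scenario(50, 190, 177, -1.0, 5.0)

results = similar_scenarios(query, history, k=2)
for dist, action, outcome in results:
    print(f"distance {dist}: {action} -> {outcome}")
```

The distance is shown next to each match so the user can judge how similar the past case really was before trusting its outcome.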

APPROACH: We could collect much of this data: some passively, some actively. Let's assume that there are smartphone apps that can passively gather some exercise data (pedometers, Nike+), that the new Sanofi/AgaMatrix can gather BG test data, and that appropriately incentivised users would employ exercise and diet apps to share food, stress, personal and exercise data points. With all this data from thousands of people posting more data every day, I have to believe that we would all benefit.

Could we do this? How? What are the hurdles? How could we get over them?

I suppose that what I'm looking for is a system that acts as the nurse educator or endo 24/7, combining my recent data and the experience of hundreds of other patients to make a recommendation. Without learning from other patients, our diabetes team would be no better at predicting results than the rules they originally learned when they trained. Similarly, a system that combines input from thousands of users and millions of data points should be pretty doggone accurate.

As well, a fixed algorithm or a single-patient model has two problems: 1) it assumes I'm like the 'average' user, which no one is, and 2) it doesn't learn from experience.

It seems to me that crowdsourcing solves all those problems. You could definitely start with a base algorithm, but as data accumulates, the system should adjust and iterate. (It may learn, as a tiny example, that people over 188 pounds consistently require 8% more insulin than the average user.) Over time, when a user queries the system as to "What do I do now?", he/she would get a set of examples that are increasingly representative of his specific variables. It's a neural network. It's more a Google approach than an expert system approach.


I do agree that the system should only suggest, and would require that the user decide what to do. Also, having a large base of data reduces the chance that an error in data input would dramatically affect reliability. In fact, the system could suggest that you recheck your input as it seems not to align with patterns among the other 2,000 users.
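The "recheck your input" idea could be sketched as a simple outlier test against what other users entered for the same thing. The function name, the threshold of 3 standard deviations, and the sample carb counts are all assumptions for illustration.

```python
import statistics


def looks_inconsistent(value, crowd_values, threshold=3.0):
    """True if value is more than `threshold` standard deviations from the crowd mean."""
    mean = statistics.mean(crowd_values)
    stdev = statistics.stdev(crowd_values)
    if stdev == 0:
        # Everyone entered the same number; anything different stands out.
        return value != mean
    return abs(value - mean) / stdev > threshold


# Made-up carb entries (grams) from other users for the same meal.
crowd_carbs = [30, 35, 28, 32, 40, 33, 31]

print(looks_inconsistent(300, crowd_carbs))  # likely a typo for 30
print(looks_inconsistent(34, crowd_carbs))   # in line with the crowd
```

With a large crowd, a median-based test would be less sensitive to other people's typos, but the principle is the same.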

Finally, it seems to me that data presented in this kind of context would be much more accessible to the average user (some of whom are teenagers with very poor judgment!) than are charts and graphs available today.

I haven't read Pumping Insulin, but I just ordered it on Amazon.

How about this thought? First assume you can get accurate CGM data up to a few minutes ago and your insulin dose is nearly as fast (fast reaction and duration).

With the CGM and insulin doses, the pump could do a sliding basal rate based on the graphical trend of your blood sugar. A sharp increase requires more insulin and as it decreases it shuts off the insulin (similar to a heating thermostat behavior keeping the temperature within a set range).
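To make the thermostat comparison concrete, here is a toy sketch of a trend-based basal rule. The function name, the `gain`, the `low` cutoff and the `max_multiplier` cap are arbitrary numbers picked only to show the shape of the logic; they are not dosing values from anywhere.

```python
def sliding_basal(base_rate, bg, trend, low=80.0, gain=0.5, max_multiplier=2.0):
    """Return an adjusted basal rate in units/hour.

    base_rate: programmed basal rate (units/hour)
    bg:        current CGM reading (mg/dL)
    trend:     rate of change (mg/dL per minute), negative = dropping
    """
    if bg <= low:
        # Like a thermostat below its set point: shut off completely.
        return 0.0
    # Scale the basal rate up for a rising trend and down for a falling one.
    multiplier = 1.0 + gain * trend
    multiplier = max(0.0, min(max_multiplier, multiplier))
    return round(base_rate * multiplier, 3)


print(sliding_basal(0.5, 177, 0.0))   # flat trend: unchanged
print(sliding_basal(0.5, 177, 1.0))   # rising: more insulin
print(sliding_basal(0.5, 177, -2.0))  # dropping fast: shut off
```

This only works under the assumption stated above, that both the CGM reading and the insulin act almost immediately.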

Since the present CGM technology is slower and an insulin bolus is slow and lasts hours after dosing, you need a way to shut off the insulin, i.e. glucagon. Maybe design a delivery system with a glucagon reservoir too. A faster blood sugar trend would use more up-front insulin, while a slower trend would use less.

That's a cool idea. I really like the idea of the bolus being reduced if your BG is trending up and vice versa. And the idea of counteracting insulin with glucagon is also interesting. I wonder if anyone's tried that. Since our natural glucagon delivery system is still working and responding to drops in BG levels, I wonder how one could calculate the proper dosage to counter the insulin over-delivery.

There are two reasons my concept is not connected directly to the pump. One is that I'm not sure I trust the algorithm to actually deliver my bolus. I kind of like to be able to say "go" or "no go." The second is that it requires cooperation with the pump manufacturer and is likely to be regulated. The idea I'm proposing would be grass roots development, open source code and on a peer-to-peer network, so that it doesn't risk anybody getting sued. People can use the info if they want, but if they do, it's not a company providing the info; it's all of us.

OK folks. 31 of you have viewed this post. Can anybody actually contribute to building it? Design the database? Develop the initial algorithms? Build the app? Help find funding?

From a statistical standpoint having data from many diabetics will lead to more stable predictions. Could the data be collected in the cloud?

Hi Craig, I have just read your idea and I like your analytic approach using statistics & crowd sourcing. Now, for these statistics to be useful, I guess, ethnicity, age, weight, age at which diagnosed, etc, etc need also to be considered.

I am ready to spare some of my time working on this in partnership with someone who can take a more educated approach (someone technically qualified to design the algorithms, etc.).

Regards,
Sai (from India)

I agree with your point about other data that may need to be collected. The beauty of this idea is that the data that actually drives accurate predictions will become apparent as more predictions are generated and results fed back. For example, if "age at diagnosis" shows no correlation with predictive accuracy, we learn that it's irrelevant. Or, we may find that it's only relevant during the first 2 years after diagnosis, or after 24 years, or among women. We don't know until we gather the data and feed in the results.
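One simple way to check whether a variable like "age at diagnosis" matters is to correlate it with how far off the predictions were. Below is a minimal sketch using a hand-written Pearson correlation; the numbers are invented, and a real analysis would also have to look at subgroups (first 2 years, women, etc.) as described above.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either series is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        return 0.0
    return cov / (spread_x * spread_y)


# Made-up data: age at diagnosis vs. prediction error (mg/dL) per user.
age_at_diagnosis = [5, 12, 20, 31, 44]
prediction_error = [18, 22, 17, 21, 19]

r = pearson(age_at_diagnosis, prediction_error)
print(f"correlation: {r:.2f}")  # near zero would suggest the variable adds little
```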

What kind of skills do you have, Sai?

Hey Nate, in my first reply I addressed every point you made except the most important one. Is there any skill, anything or anyone you could contribute to making this idea come to life? I'm definitely not an expert in any of this, but I'm guessing we'd need (at the least) someone to design a database, plug in some pattern recognition software, a way to generate algorithms and add some sort of simple AI capability to analyze new, incoming data and feed back new rules into the algorithm. And that's before we deal with issues like sourcing some of the data from existing apps, designing a frictionless input interface and finally, getting people to sign up and try it.

If you're a scientist at Woods Hole, I'm thinking some of these people we need may be friends of yours. If not, do you know how, as you mentioned, to appeal to the open-source community (i.e., via what channels)?

Anybody else have ideas?

Jann, I completely agree that the more data, the greater the stability and the higher the confidence levels regarding the predictions.

If you mean could this be a service that sits in the cloud rather than a piece of software and database on your smartphone, absolutely. That's the idea. There are many cloud service providers who could accommodate this service, either as a private cloud or on shared servers.

I can help with the programming part. I am a software engineer by profession and right now I have spare time. I am not sure if someone could come up with a "Design Document" based on your analysis (posted above), which is a pretty good start (I guess!). One useful thing could be building a database first; moving this onto a cloud or other platform could be the secondary step. I can probably work on the database design for now (probably using an open source database like MySQL or something similar).
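As a starting point for discussion, a first-pass schema might look something like the sketch below. It uses Python's built-in sqlite3 instead of MySQL purely so it runs without a server; the table names, columns and the list of event kinds are guesses based on the variables listed in the original post, not an agreed design.

```python
import sqlite3

SCHEMA = """
CREATE TABLE users (
    id         INTEGER PRIMARY KEY,
    anon_key   TEXT UNIQUE NOT NULL,   -- anonymized identifier, no names
    birth_year INTEGER,
    weight_lb  REAL
);
CREATE TABLE events (
    id          INTEGER PRIMARY KEY,
    user_id     INTEGER NOT NULL REFERENCES users(id),
    recorded_at TEXT NOT NULL,         -- ISO 8601 timestamp
    kind        TEXT NOT NULL CHECK (kind IN
                    ('bg', 'bolus', 'basal', 'food', 'exercise', 'stress')),
    value       REAL,
    unit        TEXT,
    note        TEXT
);
CREATE INDEX idx_events_user_time ON events(user_id, recorded_at);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# Made-up sample rows.
conn.execute("INSERT INTO users (anon_key, weight_lb) VALUES (?, ?)", ("u-001", 190))
conn.executemany(
    "INSERT INTO events (user_id, recorded_at, kind, value, unit) VALUES (?, ?, ?, ?, ?)",
    [(1, "2000-01-01T08:00:00", "bg", 177, "mg/dL"),
     (1, "2000-01-01T08:05:00", "food", 30, "g carbs")],
)

rows = conn.execute(
    "SELECT kind, value FROM events WHERE user_id = ? ORDER BY recorded_at", (1,)
).fetchall()
print(rows)
```

Keeping every reading in one generic `events` table makes it easy to add new kinds of data later without changing the schema.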

Check out the presentations by John Walsh at diabetesnet.com (he wrote Pumping Insulin, the 5th edition is out now) : Diabetes Presentations

In this poster: DiabTech2007Poster.pdf he presents anonymous data from over 500 Cozmo pump users and the data on the pump settings is very interesting. Most of the settings are not normally distributed as would be expected. A large number of pump users have "magic number" parameters and therefore are likely not optimized.

I think that this kind of data is good for getting a 1st approximation of settings for an individual user (total daily dose, correction factor, carb factor, etc. for a given weight, BMI, age, sex, time after diagnosis), but beyond that I think that optimum parameters are extremely specific to an individual, and that is where the value of a software tool lies.

I would imagine a careful iterative process of say two weeks of data analyzed, and then small adjustments to basal programs, carb factors, and correction doses, followed by another 2 week period and adjustments.

Another issue that Walsh addresses is that the different pump manufacturers have different ways of calculating bolus recommendations (AACE2007Poster.pdf); therefore a software tool would also be pump-specific.

More info:
http://www.diabetesnet.com/diabetes-tools
http://www.diabetesnet.com/diabetes_tools/pumpsettings/
http://www.opensourcediabetes.org/

And these free papers:

http://www.ncbi.nlm.nih.gov/pubmed/19888377
http://www.ncbi.nlm.nih.gov/pubmed/19888378
http://www.ncbi.nlm.nih.gov/pubmed/21303635
http://www.ncbi.nlm.nih.gov/pubmed/20920437

And not free (contact me):
http://www.ncbi.nlm.nih.gov/pubmed/19158048

Mark, thank you for pointing me to so much great information. I'm still working my way through it. So far, I found the OpenSourceDiabetes.net Optimal Insulin Pump Settings Tool to be really interesting and the closest to what I'm talking about.

To me, the challenge is not in understanding what general regimen or rule-set will work for the average person most of the time. The challenge is to recommend specific action(s) in realtime to a person with particular characteristics in a defined 'current' situation. Basically, answer the question "What do I do now?" Like having an endocrinologist on your shoulder whispering in your ear. So, all the general "regimens" and "rules" could be written into the program as a hypothetical foundation, but they will be used only to the extent that they are the best set of rules for the user's scenario. Otherwise, the system will infer a set of rules on the fly, based on the huge base of data. Based on those rules, it will be able to offer recommendations.

It seems to me that the fundamental differences between medical research regarding insulin regimens and the concept of a "virtual pancreas" are in three areas: 1) the continual aggregation of data into a shared warehouse (rather than analysis of one-time batches of data), 2) frictionless, realtime collection of data that is not limited to the variables monitored by a CGM+pump+meter (i.e., exercise, stress, physiological info, disease history, etc.) and finally, 3) that the output is meant for use at a given instant in time (not as a learning tool for future consideration or analysis). But most importantly, it is not structured as a theorem leading to a proof and a fixed algorithm, but as a query to a database to solve for a particular scenario, based on patterns detected in the aggregated data. And data from the individual could be incorporated into those calculations with greater weight.
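The last point, giving the individual's own data greater weight, could be as simple as a weighted average over matching records. In this sketch the function name, the `personal_weight` of 5 and the sample BG changes are all made up for illustration.

```python
def weighted_bg_change(records, user_id, personal_weight=5.0):
    """Average BG change over matching records, counting the user's own records more.

    records: list of (user_id, bg_change_mg_dl) tuples for similar past scenarios
    """
    total = 0.0
    weight_sum = 0.0
    for uid, change in records:
        weight = personal_weight if uid == user_id else 1.0
        total += weight * change
        weight_sum += weight
    if weight_sum == 0:
        raise ValueError("no matching records")
    return total / weight_sum


# Made-up outcomes: the crowd went up after this action, "me" went down.
records = [("me", -10), ("a", 20), ("b", 30), ("c", 20)]

print(weighted_bg_change(records, "me"))      # pulled toward my own history
print(weighted_bg_change(records, "nobody"))  # plain crowd average
```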

Bottom line is, I think that people with diabetes (and their caretakers) should be provided with simple, crystal-clear support for the many day-to-day decisions they face in managing their diabetes. We need to eliminate gut feel, guessing, compliance, calculating, charts, and graphs and ratios. My suggestion is that this decision support would take the form of specific recommendation options accompanied by rationale and/or examples. As I described above.

That's the idea. I believe it can be done and, as was done at OpenSourceDiabetes.net, I'd like to get it built and see how it works.

I have had the same thought about what we could do with all that data, although I think there are a lot of steps to getting there. I worked on making a diabetes management application for my thesis project at college and then just needed to go back and start again.

Right now I'm working on the idea of creating a standard way to store diabetes information so that it can be easily transferred from one program to another: http://sanguinediabetes.com/opendatabetes

Yes, we're working on this:

* https://github.com/bewest/decoding-carelink/commits/rewriting
* https://github.com/bewest/insulaudit/tree/master/hacking

We have a 3G-enabled prototype, but need to make the protocol more robust.
Please help. Will hopefully add pictures and more instructions soon.

We are working in conjunction with DUBS https://github.com/bewest/diabetes-understanding-by-simulation / http://code.google.com/p/diabetes-understanding-by-simulation/ to build predictions based on similar events.

The more I keep thinking about it, the more I wonder if it's possible to make a super pump using an Arduino board or a Raspberry Pi. Looking at the components of a normal pump, they are pretty basic and can be picked up for less than $50 off the component sites. Suppose it was possible to get some kind of CGM system, maybe a dual-chamber pump (some are appearing on the market now and, as expected, are silly money) for both insulin and glucagon, maybe a heart rate monitor to spot exercise, maybe a blood pressure monitor for gathering stress info, and heck, any other sensor that's needed, possibly temperature to bring in environmental variables. Given enough gathered data, a processor like those on the Arduino or Pi could make this kind of thing possible. Or am I just oversimplifying things here?

Heck, an open source pump reference design. Talk about a great way to shake up the over charging medical companies and get something in the hands of everyone.

Here's a sensor platform for Raspberry Pi/Arduino: http://www.cooking-hacks.com/index.php/ehealth-sensors-complete-kit-biometric-medical-arduino-raspberry-pi.html

Ohhh now that's the thing! Thanks Morgdan. Will have to see about getting one of those to tinker with and possibly order up some parts for a pump mechanism to add to a board.

Noticed yesterday in the press how they were basically using a normal pump with some new software which predicts random events (yes, that sounded odd at the time and looked even worse when I read up on the maths behind it) to deal with dosing for a pump connected to a CGM. The issue is that they seem to be causing hypos, which is quite possibly down to, as we all know, the inaccuracy of CGMs and the number of outside factors which make one calculation to rule them all for working out doses not really possible. Am quite surprised a project like this hasn't looked at other sensor information though.

Artificial Pancreas in news: http://www.cam.ac.uk/research/news/artificial-pancreas-promise-for-common-diabetes-complication

Good suggestion. I've documented it here: https://github.com/medevice-users/diabetes#hardware

Please suggest/add more. I have some Arduino stuff buried in the insulaudit project. For BeagleBone, we have an open-embedded image, and we have an H8 OE build kind of stubbed out ;-)

We really are doing this... we can change the names per suggestion, but there are definitely more than enough people to actually hack the full stack, and we'll need even more people to help document it all. Regardless of your expertise, we're going to break down all these problems so that one of them is a well-formed question for your particular expertise.

Hello Craig and fellow geeks with diabetes!

I am a type 1 with a background in consumer research for product development, knowledge management, and information mapping. This topic elicited a WOW! response in me when I read it. It is so well considered. I believe if we put our heads together, we can create something within the next year that will yield the results we seek.

The challenge to this, after some deep diving into the nitty gritty and speaking with stakeholders at the device companies, research programs involved in algorithm development and models of virtual patients for simulations, and software developers, seems to be a knowledge brokering one. In sum, it looks like we could pool our knowledge to get this done. We have the technology, the expertise and the user requirements between us. Plus, others in the field have done some compelling work to develop this type of solution (for instance, 4DSS - Wiley, Marling, Schwartz et al).

I have devoted my masters thesis work (to conclude shortly) to this need, focusing on knowledge transfer and coordination of stakeholders (we the patients!!! and others) in order to map out the tools, knowledge, and people that would need to come together for such a concept to be brought to fruition.

I am also working on concepts to submit for the Sanofi data design diabetes challenge (deadline April 7). http://www.datadesigndiabetes.com. The basic ideas supporting the concepts are:

1. We have the knowledge and technology to achieve smarter data management software that actually helps us make decisions as Craig describes above.

2. There is currently too much of a tradeoff between lifestyle flexibility, self-management effort, and good results. We want to live life the way we choose, whether as a foodie, an athlete, or any type 1 diabetic looking to develop strategies for situations ranging from responding to emotional stress to understanding the impact of disconnecting for a long bubble bath. We need the ability to run self-experiments and obtain intelligible feedback in the form of integrated data in order to develop personal strategies. Success in this disease depends on patients' ability to self-manage, as well as minimizing the opportunity cost of managing our disease. We need smart software to free us from the burden we currently face.

3. We need solutions that are compatible with human intelligence and the real scarcity of attention and time to deal with this. Current software is not used by the majority of insulin pumpers, and obstructs both our ability to develop adequate understanding of the disease in our own bodies and our ability to collaborate with our physicians.

My questions for you are:

Where are we in understanding the application of technology, such as the use of Arduino or Raspberry Pi-style boards to enable sniffing of data from our devices, machine learning or big data analytic tools to help us make sense of our data, or apps to self-track with speed, accuracy and convenience (e.g. quantified self-style tools that would help us get around duplicate and triplicate work to document the variables we personally need, whether it's food intake, sleep, medication, activity, et cetera)?

Would you be interested in contributing any knowledge to the development of such a software system in the next year (or joining a crack team of user researchers, data visualization experts, statistical programmers or engineers in order to combine forces)? If so, what knowledge do you possess? What knowledge would you like to gain from others to help you in any efforts you may have underway?

What questions do we have in this community that, if answered, could help us collectively or separately tackle this challenge?

Thanks to all, and all the best!

Elsa