About the usefulness of early detectors

When one claims to design an early detector for some illness, it is important to think about early detection in healthcare in general: its benefits, its inefficiencies, and the new problems it creates.

As a personal anecdote about the usefulness of early detectors: I had a skin carcinoma that two doctors (the family and the company MDs) saw for years without reacting; one of them even asked me what it was. In the end it was a cardiologist who told me it was probably a carcinoma and that I had to consult a specialist quickly.
MDs have to know what to make of the test results of those devices. For example, some medical organizations are starting to provide free kits for genetic screening for some conditions [0]. As we know, some drugs work well for some genomes but less well for others, a concept that is a bit weird in itself but very fashionable at the moment.
But those kits do not work the same way, so their results are not comparable with each other: some may analyze the DNA in blood, while others may take a sample with a biopsy needle. Neither can claim to capture the full picture of a tumor’s mutations. In addition, a tumor’s genome evolves very quickly and is not homogeneous; it is as if many mutations were branching out quickly from a common ancestor cell. Some time later, a tumor is the site of several unrelated mutations.
Such tests sometimes provide conflicting or overlapping results for the same patient. Researchers at the University of California, San Diego published a 168-patient study on this discordance in early 2016, showing that there is overlap as well as differences between DNA analyses from tissue biopsies and from blood samples.
Some tests even make drug suggestions; studies have shown that different commercial solutions may in some cases suggest different drugs, or fail to suggest drugs that an MD would have prescribed. Those commercial products need to improve, and doctors’ professional bodies need to develop guidelines teaching how to cope with these new tools.

Another issue is the false negative. The press recently reported an unfortunate case where a woman felt something was wrong with her baby in the last months of her pregnancy. She used a fetal Doppler and found a heartbeat; unfortunately, the baby was stillborn. It is possible that if she had not used her fetal Doppler, she would have gone immediately to her hospital, which might have saved the baby.
False positives are another problem. As an older man, I am regularly reminded by the state health insurance to check my PSA (prostate-specific antigen), a marker of prostate cancer. I am aware of the risk of cancer, but large studies (one in the US, two in Europe) found that for a thousand men screened, one will probably be saved, while several dozen will suffer a severe degradation of their quality of life and general health.

The testing process may also impose travel costs on the patient, loss of time and income, and discomfort or even suffering, especially in women’s healthcare. Unnecessary biopsies and other medical procedures, for people who are wrongly diagnosed or whose cancer might never have spread, can also bring on health problems of their own.
While early detectors might seem a good idea in general, one problem is the anxiety they generate: even if everything is fine today, it does not mean everything will stay fine, so there is a constant urge to re-check. Even medical doctors can succumb to cognitive bias: when they find “something” in a mammography, they order more tests which come back negative, but nevertheless urge more frequent testing in the future, creating unnecessary anxiety for the patient [1].

What does all this mean for the designer of an early detector of heart failure? Certainly that one should not make big, unwise claims. There is also a need to collaborate with practicing doctors, not only scientists.
At the same time, how do we attract people’s attention so that they use it and finance the R&D?

[0] http://www.xconomy.com/national/2017/05/31/in-maine-making-cancer-dna-tests-free-and-asking-tough-questions/
[1] https://blogs.scientificamerican.com/cross-check/why-we-overrate-the-lifesaving-power-of-cancer-tests/

Refactoring and randomness test

The first usable versions of our feature detection code (findbeats.java) were full of hardwired constants and heuristics.
The code has now been modularized: it is spread across several methods, each with clean exit conditions.

We were proud that our design was able to look at each individual beat and heart sound, which goes much further than what ML code usually does. Something really interesting was how we used compression to detect heart sound features automatically in each beat.
Now we introduce something similar in spirit. Until now, our code was sometimes unable to find the correct heart rate when the sound file was heavily polluted by noise. We now use a simple statistical test, akin to the standard deviation, to test the randomness of the beat distribution. If the beats are distributed at random, it means our threshold is too low: we are detecting noise in addition to the signal.
This helped us improve the estimation of the heart rate.
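
The randomness test can be sketched as follows. This is an illustrative reconstruction, not our actual findbeats.java code: it uses the coefficient of variation (CV) of the inter-beat intervals, since truly rhythmic beats have intervals clustered around their mean, while events produced by a too-low threshold look Poisson-like, with a standard deviation close to the mean.

```java
import java.util.Arrays;

// Sketch of a randomness test on detected beat times (hypothetical helper
// names, not the project's actual code). For a genuine rhythm, inter-beat
// intervals cluster around their mean; for threshold noise, event times look
// Poisson-like, so the interval standard deviation approaches the mean (CV ~ 1).
public class BeatRandomness {

    // Coefficient of variation of the intervals between successive beat times.
    static double intervalCv(double[] beatTimes) {
        double[] iv = new double[beatTimes.length - 1];
        for (int i = 1; i < beatTimes.length; i++)
            iv[i - 1] = beatTimes[i] - beatTimes[i - 1];
        double mean = Arrays.stream(iv).average().orElse(0);
        double var = Arrays.stream(iv).map(x -> (x - mean) * (x - mean)).average().orElse(0);
        return Math.sqrt(var) / mean;
    }

    // If the CV is above the cutoff, the "beats" are mostly noise,
    // i.e. the detection threshold was set too low.
    static boolean looksRandom(double[] beatTimes, double cvCutoff) {
        return intervalCv(beatTimes) > cvCutoff;
    }

    public static void main(String[] args) {
        double[] regular = {0.0, 0.8, 1.6, 2.4, 3.2, 4.0};  // steady 75 bpm
        double[] noisy   = {0.0, 0.1, 0.9, 1.0, 1.05, 2.7}; // threshold chatter
        System.out.println(looksRandom(regular, 0.5));      // false
        System.out.println(looksRandom(noisy, 0.5));        // true
    }
}
```

The 0.5 cutoff is an arbitrary value for the example; in practice it would be tuned on real recordings.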

In an unrelated area, we also started to work on multi-HMMs, which means detecting several concurrent features. An idea we are toying with would be to use our compression trick at the beat level, whereas it is currently used at the heart sound level. This is tricky but interesting in the context of a multi-HMM; indeed, it makes the multi-HMM more similar to unsupervised ML algorithms.

Multi-HMM for heart sound observations

Up to now, feature detection has used something that I find funny, but it works really well. As we use Hidden Markov Models, we must create a list of “observations” from which the HMM infers a model (the hidden states). Creating trustworthy observations is therefore really important, and it is a design decision that those observations are the “heart sounds” that cardiologists name S1, S2, etc.

In order to detect those events, we first have to find the heart beats, then find the sonic events within each of them. In CINC/Physionet 2016, submissions use an FFT to find the basic heart rate, and because an FFT cannot inform on heart rate variability, they compute various statistical indicators linked to it.
This is not a very good approach, as the main frequency of an FFT is not always the heart beat rate.
Furthermore, this approach is useless at the heart beat level, and even more so at the heart sound level. So what we did was detect heart beats (which is harder than one might think), and from that point we can detect heart sounds.

A series of observations consisting only of four heart sounds would not be useful at all; after all, an Sn+1 heart sound is simply the heart sound that comes after the Sn heart sound. We needed to capture more information and somehow pre-classify the heart sounds.

This was done (after much effort) by computing a signature based on a compressed heart sound. Compression is a much funnier thing than it might seem: to compress, one has to remove as much redundant information as possible, which means that a perfectly compressed signal can serve as a token for that signal, and logical operations can be performed on it.
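
To illustrate the idea, here is a minimal run-length signature of a heart sound. The helper names and the quantization into a few amplitude levels are assumptions for this sketch; our actual scheme differs:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a compression-based signature (not the project's
// actual code). The heart sound is quantized to a few amplitude levels, then
// run-length encoded; the resulting short string of "level:count" pairs acts
// as a compact token that can be compared between sounds.
public class RleSignature {

    // Quantize each sample to one of `levels` bins over [-1, 1].
    static int[] quantize(double[] samples, int levels) {
        int[] q = new int[samples.length];
        for (int i = 0; i < samples.length; i++) {
            double clipped = Math.max(-1, Math.min(1, samples[i]));
            q[i] = (int) Math.min(levels - 1, (clipped + 1) / 2 * levels);
        }
        return q;
    }

    // Run-length encode the quantized sequence into "level:count" tokens.
    static String signature(double[] samples, int levels) {
        int[] q = quantize(samples, levels);
        List<String> runs = new ArrayList<>();
        int start = 0;
        for (int i = 1; i <= q.length; i++) {
            if (i == q.length || q[i] != q[start]) {
                runs.add(q[start] + ":" + (i - start));
                start = i;
            }
        }
        return String.join(",", runs);
    }

    public static void main(String[] args) {
        double[] s1 = {0.0, 0.0, 0.9, 0.9, 0.9, -0.8, -0.8, 0.0};
        System.out.println(signature(s1, 4)); // prints "2:2,3:3,0:2,2:1"
    }
}
```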

People in AI research sometimes fantasize that compression is the Grail of machine learning, making feature detection automatic. We are far from thinking that: in order to compress, one has to understand how the information is structured, whereas automatic feature detection implies that we do not know its structure.

It is the same catch-22 problem that the Semantic Web met ten years ago: it could reason on structured data but not on unstructured data, and the only real breakthrough would have been reasoning on unstructured data. That is why we now have unsupervised machine learning, with algorithms like Deep Forest. While CINC 2016 submissions used unsupervised ML heavily, we used compression (run-length limited, RLL) to obtain a “signature” of each heart sound, and it works surprisingly well with our HMM.

The next step is to implement a multi-HMM approach, because there are other ways to pre-categorize our heart sounds than their RLL signature; for example, a heart sound might be early or late, and that characteristic could be used to label it.

Heart beat detection and segmentation

This is a quick description of the feature identification and segmentation algorithm of our early heart failure detector.

1) A sound file consists basically of a float array and a sampling rate.
2) This sound is normalized in amplitude (though we can do without) and in sampling rate (2000 samples per second).
3) Contrary to what is done in Physionet 2016, there is no filtering or elimination of “spikes”.
4) The cardiac rhythm is detected through events that are roughly the S1 events (FindBeats.java). This is not trivial, as there are noises, spikes, “plops”, respiration, abnormal heart sounds, human speech and, in one case, even dog barks! Basically there are two ways to achieve beat detection: one is to find the frequency of the envelope of the signal, the other is to compute an FFT. But both approaches are subjective, as they imply deciding beforehand what an acceptable heart rate is. What does not help is that the heart rate can roughly go from 30 to 250 beats per minute, so when we detect three or four main frequencies in that range in the FFT of the sound file, which one is the correct one? One cannot decide in advance. What we do instead is a two-step, indirect approach that fails quickly if we make a wrong guess, so it can converge quickly toward a result:
* In the first step, we estimate the heart beat duration independently of the heart rate (because some noise sources make heart rate counting unreliable).
* We use this first estimate to make a better guess of the heart rate in the second step. If the guess gives obviously wrong results, we change the detection threshold.
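
The normalization of step 2 could be sketched like this (helper names are illustrative, not our actual API): peak-normalize the amplitude, then resample to a fixed 2000 samples per second by linear interpolation.

```java
// Sketch of the normalization step: peak-normalize the amplitude and
// resample to a fixed rate (e.g. 2000 Hz) by linear interpolation.
// Helper names are hypothetical, not the project's actual code.
public class Normalize {

    // Scale so the largest absolute sample becomes 1.0.
    static double[] normalizeAmplitude(double[] samples) {
        double peak = 1e-12;
        for (double s : samples) peak = Math.max(peak, Math.abs(s));
        double[] out = new double[samples.length];
        for (int i = 0; i < samples.length; i++) out[i] = samples[i] / peak;
        return out;
    }

    // Linear-interpolation resampling from srcRate to dstRate.
    static double[] resample(double[] samples, double srcRate, double dstRate) {
        int n = (int) Math.floor((samples.length - 1) * dstRate / srcRate) + 1;
        double[] out = new double[n];
        for (int i = 0; i < n; i++) {
            double t = i * srcRate / dstRate;   // position in source samples
            int j = (int) t;
            double frac = t - j;
            double next = (j + 1 < samples.length) ? samples[j + 1] : samples[j];
            out[i] = samples[j] * (1 - frac) + next * frac;
        }
        return out;
    }

    public static void main(String[] args) {
        double[] raw = {0.0, 2.0, 0.0, -4.0, 0.0};   // recorded at 1000 Hz
        double[] norm = normalizeAmplitude(raw);      // peak is now 1.0
        double[] up = resample(norm, 1000, 2000);     // upsampled to 2000 Hz
        System.out.println(up.length);                // prints 9
    }
}
```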

4-1) We seek the optimal detection level (threshold) enabling detection of the heart beat rate in this sound file.
4-1-1) We create a window on the sound file that is heavily low-pass filtered (currently 200Hz).
4-1-2) We register the times of downward passage through the current threshold level.
4-1-3) If a convincing heart rate (not necessarily exact) is obtained, the detection level is kept; otherwise it is lowered and the procedure is repeated.
4-2) With this threshold, we try to obtain a cardiac rhythm with roughly the same procedure as above, but with a window filtered at 1000Hz and within a wide interval around the heart rate obtained in the first step.
4-2-1) We create a sound window which is low-pass filtered (currently 1000Hz).
4-2-2) A downward passage through the threshold is detected; ad hoc margins are imposed between the start and the end of the event.
4-2-3) If a convincing cardiac rhythm is obtained, the heart rate is kept; otherwise the detection level is lowered and we start again.
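
The threshold search of step 4-1 can be summarized in code. This is a simplified sketch, not FindBeats.java itself; the starting level, step size and acceptance range are illustrative:

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of the threshold search: record downward crossings of a
// detection level, derive a candidate rate, and lower the level until the
// rate falls in a plausible range (30..250 beats per minute).
public class ThresholdSearch {

    // Sample indices where the filtered signal crosses `level` downward.
    static List<Integer> downwardCrossings(double[] x, double level) {
        List<Integer> times = new ArrayList<>();
        for (int i = 1; i < x.length; i++)
            if (x[i - 1] >= level && x[i] < level) times.add(i);
        return times;
    }

    // Lower the threshold until the implied rate looks like a heart rate.
    // Returns the accepted level, or -1 if only noise was found.
    static double findLevel(double[] x, double sampleRate) {
        for (double level = 0.9; level > 0.05; level -= 0.05) {
            List<Integer> t = downwardCrossings(x, level);
            if (t.size() < 2) continue;                 // not enough events
            double meanGap = (double) (t.get(t.size() - 1) - t.get(0)) / (t.size() - 1);
            double bpm = 60.0 * sampleRate / meanGap;
            if (bpm >= 30 && bpm <= 250) return level;  // convincing rate
        }
        return -1;
    }

    public static void main(String[] args) {
        double[] x = new double[8001];                  // 4 s at 2000 Hz
        for (int k = 0; k < 8001; k += 2000) x[k] = 1.0; // one pulse per second
        System.out.println(findLevel(x, 2000));         // accepts 60 bpm at level 0.9
    }
}
```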

So we have a very gradual approach, yet it delivers results in a short time. It is also quite insensitive to sound events that could derail the beat counting: since the first step provides good indications of where the real heart beat is, a spike may cause the nearest S1 to go undetected, but we still keep a good idea of the beat duration.
5) In addition to S1, many other events are detected. A priori we assume that these are the other events S2, S3, S4. Even with numbers going higher than four, this is useful for classifying unusual heart sounds.
6) These two sets of Sx event detections are reconciled:
6-1) The list of S1 events is made more reliable, which makes it possible to deduce the events S2, S3, S4 (Segmentation.java).
6-2) A signature of the heart beat is computed, acknowledging that there is more to a heart beat than its time of arrival and duration. We tested several schemes and decided to use a Huffman compression of the heart beat. We also had the idea of using this for yet another kind of training-free feature detection, but it is not implemented at the moment for lack of resources.
7) From there, one can either train an HMM or classify. At this point nothing is cardiac-specific anymore; it is just an HMM.
8) The classification made by the HMM is interpreted, with a similarity score and comments on the Sx events.
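
To illustrate step 6-2, here is a minimal Huffman-style signature: the vector of Huffman code lengths computed from a beat’s amplitude-level histogram. Beats with similar level distributions compress the same way, so the length vector is a compact descriptor. This is a sketch of the principle, not our Segmentation.java:

```java
import java.util.PriorityQueue;

// Hypothetical sketch: use Huffman code lengths as a beat signature.
public class HuffmanSignature {

    static class Node implements Comparable<Node> {
        int symbol;            // quantized level, or -1 for internal nodes
        long weight;           // occurrence count
        Node left, right;
        Node(int s, long w) { symbol = s; weight = w; }
        public int compareTo(Node o) { return Long.compare(weight, o.weight); }
    }

    // Huffman code length per symbol, given occurrence counts (0 = absent).
    static int[] codeLengths(long[] counts) {
        PriorityQueue<Node> pq = new PriorityQueue<>();
        for (int s = 0; s < counts.length; s++)
            if (counts[s] > 0) pq.add(new Node(s, counts[s]));
        if (pq.size() == 1) {                    // degenerate one-symbol case
            int[] one = new int[counts.length];
            one[pq.poll().symbol] = 1;
            return one;
        }
        while (pq.size() > 1) {                  // merge the two lightest nodes
            Node a = pq.poll(), b = pq.poll();
            Node p = new Node(-1, a.weight + b.weight);
            p.left = a; p.right = b;
            pq.add(p);
        }
        int[] lengths = new int[counts.length];
        fillDepths(pq.poll(), 0, lengths);
        return lengths;
    }

    static void fillDepths(Node n, int depth, int[] lengths) {
        if (n.symbol >= 0) { lengths[n.symbol] = depth; return; }
        fillDepths(n.left, depth + 1, lengths);
        fillDepths(n.right, depth + 1, lengths);
    }

    public static void main(String[] args) {
        // Amplitude-level histogram of one beat: level 0 dominates (near
        // silence), levels 1..3 are the louder parts of S1/S2.
        long[] counts = {90, 5, 3, 2};
        int[] sig = codeLengths(counts);
        System.out.println(java.util.Arrays.toString(sig)); // prints [1, 2, 3, 3]
    }
}
```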

Another difference with Physionet 2016 is that they add a second approach, the variability of the cardiac rhythm, for which they calculate many indicators. The HMM does this more accurately, with state-to-state transition probabilities that can be explained, instead of a scalar computed over the whole file.

In a future version, I would like to work on the frequency peaks that can be identified with an FFT.
The general idea would be to look for what appear to be harmonics but would in fact indicate echoes along different paths.

Heart sound files analysis

Not much to show, but some news:
Sound files have problems that I did not anticipate. What I was expecting, from the analysis of Physionet 2016 submissions, was noise, spikes, weird amplitudes and similar distortions of the signal.
What I found was different: there is little noise once you filter a bit, and there are few spikes.
However, the signal is sometimes biased (more negative values than positive), and it also appears to have little in common with textbooks: I can easily detect S1 and S2 events, but it is difficult to find S3 and S4.

When you listen to the sounds, half of them sound weird. I am not a cardiologist, but I find it difficult to relate what I hear to a “textbook” heart sound.
This makes me think again about Physionet 2016: successful submissions were mainly about heavy filtering, dealing with spikes with sophisticated algorithms, and finding characteristics (“features” in ML slang) that encompass the whole file, such as RR variability.

Clearly my approach is different: I focus on what identifies a heart beat, which is entirely new. But I still plan to implement the RR variability analysis and tie it to my HMM classifier, which will become quite hybrid in the process.

Early and low cost detection device for Heart Failure

Six months ago we registered a new project on Hackaday and some other places.

The idea was to detect the heart failure condition early, because it is a condition that affects most of us as we age, and there was a lot of material online thanks to various challenges on this subject, such as the Physionet 2016 challenge.

To create a proof of concept, we used a low-cost fetal Doppler ($50) and a Linux box, and we were able to record heart sounds on an adult without using gel. So one of the requirements for medical devices was fulfilled: being ready to use in seconds.

In most medical devices there is an implicit requirement: to make the output understandable, the device must offer an explanation of the medical statement. So using a black-box ML à la Kaggle is out of the question.

In heart sound competitions like Physionet 2016, HMMs are trained to create a statistical model of the heart sounds of some condition. HMMs can “explain” their internal model by showing the probability of appearance of each state, for example the probability of arrival of an S2 sound at some time after an S1 sound in a particular sequence of heart sounds.
So an HMM model can be used to classify a new sequence of heart sounds as either quite similar to the trained model or not.
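
To make this concrete, here is a minimal sketch of how a trained HMM scores a new observation sequence (the forward algorithm in log space). The two-state model and all its numbers are invented for the example; a real model would be learned from labeled heart sounds.

```java
// Forward-algorithm sketch: log P(observations | model), computed in log
// space for numerical stability. Model parameters here are illustrative.
public class HmmScore {

    // log(exp(a) + exp(b)) without overflow, to add probabilities in log space.
    static double logAdd(double a, double b) {
        double m = Math.max(a, b);
        return m + Math.log(Math.exp(a - m) + Math.exp(b - m));
    }

    // pi: initial log-probs, A: transition log-probs, B: emission log-probs.
    static double logLikelihood(double[] pi, double[][] A, double[][] B, int[] obs) {
        int n = pi.length;
        double[] alpha = new double[n];
        for (int s = 0; s < n; s++) alpha[s] = pi[s] + B[s][obs[0]];
        for (int t = 1; t < obs.length; t++) {
            double[] next = new double[n];
            for (int s = 0; s < n; s++) {
                double acc = Double.NEGATIVE_INFINITY;
                for (int r = 0; r < n; r++) acc = logAdd(acc, alpha[r] + A[r][s]);
                next[s] = acc + B[s][obs[t]];
            }
            alpha = next;
        }
        double total = Double.NEGATIVE_INFINITY;
        for (double a : alpha) total = logAdd(total, a);
        return total;
    }

    public static void main(String[] args) {
        // Toy model: two hidden states, two observation symbols
        // ("S1-like" = 0, "S2-like" = 1); numbers are made up.
        double[] pi = {Math.log(0.9), Math.log(0.1)};
        double[][] A = {{Math.log(0.2), Math.log(0.8)}, {Math.log(0.8), Math.log(0.2)}};
        double[][] B = {{Math.log(0.9), Math.log(0.1)}, {Math.log(0.1), Math.log(0.9)}};
        int[] alternating = {0, 1, 0, 1};  // S1 S2 S1 S2: fits the model
        int[] scrambled   = {1, 1, 1, 0};  // does not fit as well
        System.out.println(logLikelihood(pi, A, B, alternating) >
                           logLikelihood(pi, A, B, scrambled)); // true
    }
}
```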

One might ask why not use deep learning, as it seems to have made great strides recently and very nice software is available, like TensorFlow.
There is a big internal difference between ML using CNNs à la TensorFlow and ML using HMMs: in “an ideal world” a CNN finds its features without human intervention, whereas an HMM needs each observation to be “tagged” with some human knowledge, through a Viterbi or similar function. The tagging is part of what makes the resulting model understandable; however, automatic tagging (as in unsupervised learning) is genuinely hard.
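
The tagging side can be illustrated with a minimal Viterbi decoder, which assigns the most likely hidden state to each observation. The toy model is invented for the example; our own Viterbi function is not shown here.

```java
// Viterbi sketch: most likely hidden-state path for an observation sequence,
// in log space. Model parameters are illustrative, not a trained model.
public class ViterbiTag {

    static int[] viterbi(double[] pi, double[][] A, double[][] B, int[] obs) {
        int n = pi.length, T = obs.length;
        double[][] delta = new double[T][n];   // best log-prob ending in state s
        int[][] back = new int[T][n];          // backpointers for path recovery
        for (int s = 0; s < n; s++) delta[0][s] = pi[s] + B[s][obs[0]];
        for (int t = 1; t < T; t++)
            for (int s = 0; s < n; s++) {
                double best = Double.NEGATIVE_INFINITY; int arg = 0;
                for (int r = 0; r < n; r++) {
                    double v = delta[t - 1][r] + A[r][s];
                    if (v > best) { best = v; arg = r; }
                }
                delta[t][s] = best + B[s][obs[t]];
                back[t][s] = arg;
            }
        int[] path = new int[T];
        for (int s = 1; s < n; s++)            // best final state
            if (delta[T - 1][s] > delta[T - 1][path[T - 1]]) path[T - 1] = s;
        for (int t = T - 1; t > 0; t--) path[t - 1] = back[t][path[t]];
        return path;
    }

    public static void main(String[] args) {
        // Toy two-state model: state 0 emits "S1-like" sounds, state 1 "S2-like".
        double[] pi = {Math.log(0.9), Math.log(0.1)};
        double[][] A = {{Math.log(0.2), Math.log(0.8)}, {Math.log(0.8), Math.log(0.2)}};
        double[][] B = {{Math.log(0.9), Math.log(0.1)}, {Math.log(0.1), Math.log(0.9)}};
        int[] obs = {0, 1, 0, 1};
        System.out.println(java.util.Arrays.toString(viterbi(pi, A, B, obs))); // [0, 1, 0, 1]
    }
}
```

Each observation ends up tagged with a state, which is exactly what makes the resulting model inspectable.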

In truth there is a similarity between the design of successful CNNs and HMMs: both have a cost function. However, CNN cost functions do not create meaning.
Designing the cost function of a CNN or the Viterbi function of an HMM is the most important part of any ML setup. All the claims we hear about the effectiveness of ML are due to the design of those functions, not to some fancy ML algorithm.
It is a very hard job, far beyond the state of the art.

To circumvent this problem, most ML proposals use another ML setup to create the cost function, as in most Physionet 2016 submissions or in a recent, highly regarded article in the domain of skin cancer detection: http://www.nature.com/nature/journal/v542/n7639/abs/nature21056.html.

Indeed, if one uses ML to create the cost function, the resulting model becomes highly opaque, and medical policy makers, scientists, or specialists will find it useless or even dangerous.

In the long term, this practice of using an ML-derived cost function will be discouraged, but I suppose it is part of the current hype curve around “deep learning”. It is worse with small-signal ML, as in the Deep Forest algorithm, where it becomes impossible (today) to reverse-engineer the ML model by perturbing it. In addition, deep learning cannot be done on a $50 device; it requires huge computing facilities.

So we created our own Viterbi function for our HMM, and it is quite efficient while remaining quite simple. The next steps are to improve it, make it more informative, and move from the Linux box to a microcontroller. Stay tuned.

AIDA as a tool to assist and audit the science process

DARPA’s new AIDA program may (also) help provide a better understanding of science publications and results, by helping separate interesting from irrelevant data.

Information complexity has exceeded the capacity of scientists to glean meaningful or actionable insights, and doing science becomes more difficult as time passes. World-class experts in one field will not understand statements made by scientists even slightly outside their field.
The situation is even worse for other science stakeholders, such as scientific managers and policy makers, who have a long-standing interest in developing and maintaining a strategic understanding and evaluation of scientific activity, field landscapes, and trends. Information obtained from scientific publishing is often analyzed out of context. Often, because of the complexity and superabundance of information, independent analysis results in interpretations which may be inaccurate.
It would be interesting to overcome the noisy and often conflicting assertions made in today’s scientific publishing environment through common tooling. Some efforts have already been made, for example the excellent Galaxy tool in biology and, to a lesser extent, notebook interfaces like Jupyter in coding. Another interesting trend is pre-print activity, which helps share with other scientists information unsuitable for publication.
DARPA’s AIDA program aims to create technology capable of aggregating and mapping pieces of information. AIDA may provide a multi-hypothesis “semantic engine” that would automatically mine multiple publishing sources, extract their common foreground assertions and background knowledge, and then generate and explore multiple hypotheses interrogating their true nature and implications.
The AIDA program hopes to determine a confidence level for each piece of information, as well as for each hypothesis generated by the semantic engine. The program will also endeavor to digest and make sense of information or data in its original form, and then generate alternate contexts by adjusting or shifting variables and probabilities, in order to enhance accuracy and resolve ambiguities in line with real-science expectations.
Even structured data can vary in the expressiveness, semantics, and specificity of its representations. AIDA has the potential to help scientists and science decision makers refine their analyses so that they are more in line with the larger and more complete overall context, and in doing so achieve a more thorough understanding of the elements and forces shaping science.

Low cost, non-invasive Continuous Glucose Monitoring utilizing Raman spectroscopy

We present a high quality, low cost, non-invasive Continuous Glucose Monitor (CGM) based mainly on Raman spectroscopy. In addition, a number of sensors provide information about the patient’s context. The CGM re-calibrates itself automatically.

Designing non-invasive continuous glucose monitoring (CGM) is an incredibly complex problem that presents a number of challenging medical and technological hurdles. It is said that around 70 companies have tried to bring non-invasive glucose monitoring devices to market, without any success.

Quality in our CGM proposal comes from the number of technologies used to increase measurement precision. Understanding the biological operating context makes it possible to predict glucose values accurately.

More information here: glucose_monitor

Analysing eye biomarkers at home with passive infrared radiation

Currently there is no portable device that can check for diseases of the aging eye such as glaucoma, age-related macular degeneration, diabetic retinopathy, Alzheimer’s disease, cataract, clinically significant macular edema, keratoconjunctivitis sicca (dry eye disorder), Sjogren’s syndrome, retinal hard exudates, ocular hypertension, and uveitis.

We propose a portable device which, when placed before one eye but without any physical contact, analyzes the eye’s natural infrared spectrum in order to detect molecules that reveal a potential medical condition. If a biomarker is detected, the device asks the user to consult a medical doctor, with an indication of urgency but without disclosing any medical information. The doctor, on the other hand, can securely access a wealth of information without needing a dedicated device.

The medical doctor proposes this tool to the patient, and remains constantly in control of the device and of the relationship she has with her patient.

More information here: passive_eye_care

Mesosphere light scattering as a cell tower substitute

Modern wireless technology cannot transmit energy and information with a good enough SNR over 80 km and beyond the Earth’s curvature, in portable low-cost devices, under current regulations.

We propose a very different approach based on astronomy technology: a laser emits light vertically and generates a luminous dot at high altitude (similar to astronomy’s guide star), and this light is detected at a very long distance. By modulating the luminosity of this guide star, it is possible to transmit information. This technology works even if the sky is cloudy, and in daylight.

There is no need to build any network infrastructure. Each cell in a field can access the base station even at 80 km. The cost per field station is less than $9,000. Field stations can be moved at will.

More information here: base_station_for_deserts