
Through the Looking-Glass, and What Google Found There in the Eye

Or: How Google Is Using Deep Learning to Diagnose Diseases from Eye Photos

By Hiba Khan · 13 min read

Google recently published a scientific paper showing how an artificial intelligence model can predict a number of systemic biomarkers from a simple photo of the eye. How does it work? How were these results obtained? Why is it important? We discuss all of this in this article.

The hidden treasure in the eyes

Image by v2osk on Unsplash

Diagnosing a disease often requires examinations with expensive instruments, followed by interpretation by a trained medical professional. This is not always possible: not all hospitals have the same instruments, and sometimes there is a shortage of specialists. For example, diagnosing diabetic retinopathy (DR) requires a fundus camera to examine the back of the eye, and the resulting images must then be analyzed by a highly qualified specialist. The same examination can also highlight other conditions, such as cardiovascular risk, anemia, chronic kidney disease, and other systemic parameters.

It was long assumed that only images of the fundus of the eye could be analyzed with machine learning algorithms. However, a paper published by Google showed that external photographs of the eye can enable the diagnosis of diabetic retinal disease and detect poor blood sugar control.

"Diabetes-related complications can be diagnosed by using specialized cameras to take fundus photographs, which visualize the posterior segment of the eye. By contrast, anterior imaging using a standard consumer-grade camera can reveal conditions affecting the eyelids, conjunctiva, cornea and lens." image source: here

In this work, the authors used 145,832 photos of patients with diabetes from California and additional cohorts. They trained Inception V3 (pre-trained on ImageNet) for this study, showing:

"Our results show that external images of the eye contain signals of diabetes-related retinal conditions, poor blood sugar control and elevated lipids." (source)

Briefly, Inception V3 achieved state-of-the-art performance on ImageNet at the time (top-1 accuracy above 78.1%) while being more computationally efficient than previous models. It reached these results using parallel structures (different types of convolutional layers within the same block) and aggressive regularization. In the same article, the authors laid out several principles that shaped the convolutional neural network (CNN) field in the following years:

- Avoid representational bottlenecks: the representation size should decrease gently from the inputs to the outputs.
- Higher-dimensional representations are easier to process locally within a network, which was shown to speed up training.
- Spatial aggregation allows dimensionality to be reduced without loss of information.
- Balance the width and depth of the network.

"A high-level diagram of the model". image source: here

The authors used classical supervised learning: photos of the patients' eyes served as inputs, with the ground truth being whether the patient had a given condition (diabetic retinal disease, elevated glucose, or elevated lipids).
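To make this training setup concrete, here is a minimal sketch (in Keras, and not the authors' actual code) of fine-tuning an ImageNet-pretrained Inception V3 with a single binary head; the dataset variables and the label name are hypothetical placeholders:

```python
import tensorflow as tf
from tensorflow.keras.applications import InceptionV3

# Inception V3 pre-trained on ImageNet, without the 1000-class head;
# global average pooling yields one feature vector per image.
base = InceptionV3(weights="imagenet", include_top=False,
                   pooling="avg", input_shape=(299, 299, 3))

# A single sigmoid unit for one binary target (e.g. "poor blood sugar
# control" -- the label name here is hypothetical).
output = tf.keras.layers.Dense(1, activation="sigmoid")(base.output)
model = tf.keras.Model(base.input, output)

model.compile(optimizer=tf.keras.optimizers.Adam(1e-4),
              loss="binary_crossentropy",
              metrics=[tf.keras.metrics.AUC(name="auc")])

# `train_ds` / `val_ds` are assumed tf.data pipelines yielding
# (external_eye_photo, label) batches:
# model.fit(train_ds, validation_data=val_ds, epochs=10)
```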
The model trained in this way achieved an area under the curve (AUC) of more than 80% for diagnosing diabetic retinal disease, with notable (though lower) results for glucose and lipids. These results are surprising: such systemic parameters were typically derived from images of the fundus, while this first study showed that, through deep learning, the same information can be extracted from photos of the outer eye.

In addition, using ablation studies and saliency maps, the authors could better understand why the model makes these predictions:

"first, the ablation analysis indicates the centre of the image (pupil/lens, iris/cornea and conjunctiva/sclera) is substantially more important than the image periphery (for example, eyelids) for all predictions. Second, the saliency analysis similarly indicates that the DLS is most influenced by areas near the centre of the image." (source)

Saliency map. image source: here

Ablation study: importance of different regions of the image. image source: here
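A gradient-based saliency map of this kind can be sketched in a few lines; this is a generic formulation assuming a single-output Keras model, not the paper's exact method:

```python
import tensorflow as tf

def saliency_map(model, image):
    """Pixel-level saliency: gradient of the predicted score w.r.t. the input.

    `image`: one preprocessed eye photo, shape (H, W, 3), float32.
    """
    x = tf.Variable(image[tf.newaxis, ...])      # add a batch dimension
    with tf.GradientTape() as tape:
        score = model(x, training=False)[:, 0]   # scalar prediction
    grads = tape.gradient(score, x)              # same shape as the input
    # Collapse the colour channels: max absolute gradient per pixel.
    return tf.reduce_max(tf.abs(grads[0]), axis=-1)
```

Bright regions of the returned map correspond to the pixels whose change would most affect the prediction.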
These results show that, for the growing population of people with diabetes, some parameters can be measured without the need for specialized medical personnel. In addition, photos of the outer eye could be obtained with simple consumer cameras.

"While further work is needed to determine whether there are additional requirements for lighting, photography distance or angle, image stabilization, lens quality or sensor fidelity, we hope that disease detection techniques via external eye images can eventually be widely accessible to patients, whether in clinics, pharmacies or even at home." (source)

In any case, at present these models are not intended to replace extensive screening, but rather to flag which patients would benefit from further screening (this method is more reliable than a questionnaire).

The authors continued to evaluate the model, paying particular attention to potential biases and inclusiveness. Indeed, one of the biggest problems with artificial intelligence models in the biomedical field is that a dataset that is not representative of the general population can lead to misleading results.

"our development dataset spanned a diverse set of locations within the U.S., encompassing over 300,000 de-identified photos taken at 301 diabetic retinopathy screening sites. Our evaluation datasets comprised over 95,000 images from 198 sites in 18 US states, including datasets of predominantly Hispanic or Latino patients, a dataset of majority Black patients, and a dataset that included patients without diabetes. We conducted extensive subgroup analyses across groups of patients with different demographic and physical characteristics (such as age, sex, race and ethnicity, presence of cataract, pupil size, and even camera type), and controlled for these variables as covariates." (source)

The eye, a mirror for the soul

Image by Caroline Veronez on Unsplash

The authors at Google, like other researchers, found this approach promising, so they later tried to extend it to other markers and other diseases.

So far, the authors had shown that their model can efficiently diagnose an eye disease (diabetic retinopathy). But there are thousands of diseases, and diagnosing them is complex (finding the right test, expensive instruments that are not always available, and so on). So the question remains whether the model can also capture signs of other diseases in an image of the eye.

Can we extend this approach to other diseases?

After all, deep learning models can recognize patterns that are subtle and perhaps difficult for nonexperts to recognize. With this in mind, Google researchers decided to test whether cardiovascular risk factors could be detected in ocular fundus images.

Cardiovascular disease is the leading global cause of death, and being able to diagnose it early could save countless lives. Moreover, risk stratification is key to identifying and managing groups of patients at risk. Typically, a number of variables obtained from the medical history and various tests (blood samples for glucose and cholesterol levels, age, gender, smoking status, blood pressure, and body mass index) are used to diagnose and stratify patients. Sometimes, however, not all the necessary data are available (as shown by a meta-analysis).

In this study, the authors show that it is possible to predict not only patient characteristics (useful when some data are not recorded or are missing) such as BMI, age, gender, and smoking status, but also parameters associated with cardiovascular disease, such as systolic blood pressure (SBP) and diastolic blood pressure (DBP).

"The top left image is a sample retinal image in colour from the UK Biobank dataset. The remaining images show the same retinal image, but in black and white. The soft attention heat map for each prediction is overlaid in green, indicating the areas of the heat map that the neural-network model is using to make the prediction for the image." image source: here

The authors again used the same Inception V3 model. To handle continuous variables, they used binning: they divided each variable into segments using cut-offs (for example, <120, 120–140, 140–160, and ≥160 for SBP) so that each prediction could be treated as a classification task.
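Using the SBP cut-offs quoted above, the binning step itself is tiny; a minimal sketch with NumPy:

```python
import numpy as np

# Systolic blood pressure cut-offs (mmHg) quoted in the paper:
# <120, 120-140, 140-160, >=160  ->  four classes.
SBP_CUTOFFS = [120, 140, 160]

def bin_sbp(values_mmhg):
    """Map continuous SBP readings to class indices 0..3."""
    return np.digitize(values_mmhg, SBP_CUTOFFS)

print(bin_sbp([115, 128, 155, 170]))  # -> [0 1 2 3]
```

Each bin index then becomes a class label, so standard cross-entropy training applies.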
In addition, the authors used a technique called soft attention to identify the regions associated with certain features. In short, soft attention weighs different subregions of the image and can be trained with ordinary gradient descent and back-propagation (there is no need to implement a hard attention mechanism in the model).

Soft attention can also help spot the model's prediction mistakes. image source: here

In another work, the authors tackled anemia: a condition that afflicts more than 1.6 billion people and whose diagnosis requires monitoring the hemoglobin concentration in the blood (an invasive test, which can cause pain and carries a risk of infection).

Prediction of anemia classifications with deep learning. image source: here

In this case, they used Inception V4, a later version of the model described earlier (in this follow-up article, the authors show that the architecture of Inception V3 can be improved by adding residual connections). Inception V4 demonstrates how different types of Inception blocks (containing convolutional layers both in parallel and in sequence) can be tested and combined.

Residual connections. image source: here

In this later work, Google shows that the approach is not limited to classification (does the patient have anemia or not?) but can also measure the hemoglobin concentration directly (a regression task). The authors trained one model for classification and one for regression (both Inception V4 pre-trained on ImageNet). For the regression, they simply used mean squared error as the loss (instead of cross-entropy). The final predictions were made by an ensemble of 10 models (trained in the same manner), whose outputs were averaged.

"Prediction of hemoglobin concentration. Each blue dot represents each patient's measured hemoglobin concentration and predicted value". image source: here

A glimpse of a wide landscape

Image by Bailey Zindel on Unsplash

So far, the researchers had looked at a few parameters at a time. A blood test, however, typically allows many more parameters to be monitored in a single exam. Can a model estimate a whole panel of systemic biomarkers from a photo of the eye?

This is what Google tested and recently published in The Lancet Digital Health: "A deep learning model for novel systemic biomarkers in photographs of the external eye" (www.thelancet.com).

Obviously, this is not an easy task, partly because such an analysis risks producing spurious, erroneous findings (the multiple comparisons problem): the greater the number of statistical inferences conducted at the same time, the greater the risk that some of them are wrong.

Example of a spurious correlation. image source: here

For this reason, the authors first divided the dataset into two parts. They trained the model and conducted the analyses on the "development dataset", selected the nine most promising prediction tasks, and then evaluated the model on the test dataset (still correcting for multiple comparisons).

They first collected a dataset containing eye images and the results of the corresponding laboratory tests. They then trained a convolutional neural network that takes an image of the outer eye as input and predicts clinical and laboratory measurements. This is a multitask classification problem, with one prediction head per task (so that cross-entropy can be used as the loss). The cut-offs for each task were selected in consultation with clinicians.

In this case, the authors used Big Transfer (BiT), a model published in 2020 that, once pre-trained, generalizes well across a range of other datasets. The model is, in short, very similar to a ResNet, although its training uses a few tricks such as Group Normalization (GN) and Weight Standardization (WS). You can find the model in this GitHub repository.

Transfer performance of the pre-trained model, beating the state of the art. image source: here
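A multitask model of this shape, with a shared backbone and one classification head per target, might look like the following sketch. The BiT backbone is replaced by a stock ResNet50 for brevity, and the task names and class counts are hypothetical:

```python
import tensorflow as tf

# Shared image backbone. The paper used BiT (a ResNet variant trained with
# Group Normalization and Weight Standardization); a stock ResNet50 stands
# in here purely for illustration.
backbone = tf.keras.applications.ResNet50(weights="imagenet",
                                          include_top=False, pooling="avg",
                                          input_shape=(224, 224, 3))

# One softmax head per prediction task. Task names and class counts are
# hypothetical; each lab value is binned with clinical cut-offs.
tasks = {"hemoglobin": 2, "albumin": 2, "ast": 2}
heads = {name: tf.keras.layers.Dense(n_classes, activation="softmax",
                                     name=name)(backbone.output)
         for name, n_classes in tasks.items()}

model = tf.keras.Model(backbone.input, heads)

# Cross-entropy per head; Keras sums the per-task losses.
model.compile(optimizer="adam",
              loss={name: "sparse_categorical_crossentropy" for name in tasks})
```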
The deep learning model outperformed the baseline (a logistic regression on patient data). Although these results are still insufficient for diagnostic applications, they are in line with initial screening tools (such as pre-screening for diabetes).

Comparison of the AUC of the baseline model and the deep learning model. image source: here

In this and the previous studies, the images were acquired with tabletop cameras (using a headrest for the patient), producing high-quality images under good lighting conditions. The authors therefore tested whether the model still worked at reduced resolution. They found that the model is robust to image quality, even when images are scaled down to 150x150 pixels.

"This pixel count is under 0.1 megapixels, much smaller than the typical smartphone camera." (source)

"Effect of input image resolution. Top: Sample images scaled to different sizes for this experiment. Bottom: Comparison of the performance of the DLS with image size". image source: here

In addition, the authors investigated which parts of the image matter for the model's predictions. To this end, they masked several regions during both training and evaluation (for example, the pupil or the iris) and also converted the images to black and white.

"Results suggested that the information is generally not isolated to only the pupil or iris, and that colour information is at least somewhat important for most prediction targets" (source)

Experiments masking different image regions or removing colour. image source: here
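As a rough illustration of these ablations (not the paper's exact implementation, which used anatomically defined regions), the two input transforms might look like this:

```python
import tensorflow as tf

def to_grayscale(image):
    """Drop colour information but keep a 3-channel input for the network."""
    return tf.image.grayscale_to_rgb(tf.image.rgb_to_grayscale(image))

def mask_center(image, radius_frac=0.25):
    """Black out a central disc -- a crude stand-in for the paper's
    anatomically defined pupil/iris masks."""
    h = tf.cast(tf.shape(image)[0], tf.float32)
    w = tf.cast(tf.shape(image)[1], tf.float32)
    yy, xx = tf.meshgrid(tf.range(h), tf.range(w), indexing="ij")
    dist = tf.sqrt((yy - h / 2.0) ** 2 + (xx - w / 2.0) ** 2)
    keep = tf.cast(dist > radius_frac * tf.minimum(h, w), image.dtype)
    return image * keep[..., tf.newaxis]
```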
As impressive as this work is, it still has limitations, and it is premature to think it can be deployed in the real world. First, the photos were obtained under optimal conditions, and the accuracy still needs to be verified on photos taken under other conditions.

"Furthermore, the datasets used in this work consist primarily of patients with diabetes and did not have sufficient representation of a number of important subgroups — more focused data collection for DLS refinement and evaluation on a more general population and across subgroups will be needed before considering clinical use." (source)

Parting thoughts

Image from Saif71.com on Unsplash

As seen in this article, a deep learning model is capable of capturing patterns and information in the eye that are sometimes difficult to diagnose otherwise. Diagnosis is often conducted with expensive tools and invasive tests, and requires experienced personnel. Google has shown over the years that similar information can instead be obtained from an image of the eye.

In the future, many diseases could be diagnosed (or at least pre-screened) from a simple photo of the outer eye, captured with nothing more than a cell phone camera. And since quantitative results (such as hemoglobin concentration) can be obtained, these models could also be used for noninvasive patient monitoring.

On the technical side, it is interesting that a model such as Inception V3 achieved these results essentially on the first attempt, which shows how capable transfer learning and convolutional networks are. The authors also adapted the models for classification, regression, and multitask classification.

Several avenues remain open, however. Google certainly plans to expand the dataset (as stated in the article). The authors used CNNs, and could also test other models such as Vision Transformers. It is also conceivable that in the future they will experiment with a language model that takes patient or doctor's notes as input (after all, the future is multi-modal).

On the other hand, even though these applications could help where patients do not have access to well-equipped hospitals, they also raise ethical issues. As we have seen, the model can predict sensitive data such as age, gender, lifestyle (smoking or not), and other parameters. This technology could also be used for other, more problematic applications.

In any case, these studies open up very interesting applications, and it is not just Google working on such models. For example, other groups have shown that further diseases, such as hepatobiliary diseases, can be identified from the eye, and more may follow in the near future.

If you have found this interesting:

You can look for my other articles, subscribe to get notified when I publish new ones, or connect with me on LinkedIn. Here is the link to my GitHub repository, where I plan to collect code and other resources related to machine learning and artificial intelligence.
