RailwayBridge.co.uk
Medical Applications for Pattern Classifiers and Image Processing
The following article is derived from a chapter in my MSc dissertation, "One-class pattern recognition by compound classifier". This C-based neural network project was undertaken at the Department of Computer Sciences, University of Wales, Cardiff, UK, in 1998. I hope eventually to add the rest of the dissertation together with related information. I'm always happy to hear comments, ideas, suggestions, even criticisms, so please do email if you have any interest in this field. For true expertise across the whole area of machine vision and pattern recognition, contact my MSc supervisor, Professor Bruce Batchelor.
© 2000, Andrew Tomlinson: All rights reserved. No portion of this document may be reproduced, copied or revised without written permission of the author.
This section considers the role in medicine for pattern classifiers and vision inspection systems. To illustrate the points raised, three potential applications - monitoring neonatal jaundice, breast cancer screening and cervical smears - are discussed as examples.
Medicine revolves around classification and recognition.
Disease diagnosis requires the interpretation of multiple parameters. The same is true for recognising asymptomatic patients who are at risk. Having made a diagnosis, the clinician is then faced with a choice of treatments, some of which will better suit certain patients. Optimally assigning treatment ensures a better prognosis and fewer side effects, leading in turn to a more economical use of resources. These are areas in which the classification and recognition of data patterns is vital.
The scope for pattern classifiers in medicine is vast.
An early pattern classifier was built in 1973 in the London Hospital to predict the outcome of cerebral anoxia caused by cardiac arrest. This cerebral anoxia classifier, built by D E Maynard, is now on display in the London Science Museum [BAT74]. Other cited clinical applications are:
My own work in medical research has convinced me of the benefits to be gained from a practical pattern classifier. I could cite many studies to make my point, but one typical example should suffice. In 1993 I was involved in a study that used demographic details, medical history and questionnaires to try to identify general practice patients who were at risk because they did not receive immediate chiropody [GRI93]. The study was reasonably successful in that some key determinants were identified. But a practical classifier, had one been available, might well have enabled at-risk patients to be easily identified from the collected data.
Medicine, like industry, needs intelligent, flexible vision systems.
Medical image analysis is seen as enjoying reasonable success where quantitative issues are involved but only limited success where interpretive or diagnostic issues are involved [UND96]. Systems may, for example, be able to determine the number, size and distribution of retinal microaneurysms with acceptable accuracy, but intelligently mapping this information to a specific disease is much more challenging. This desire for intelligent vision systems is also seen in industry, where machine vision systems are shifting from defect detection to defect prevention [BAT97].
Industry needs systems that do not require costly reprogramming between production runs. The same is desirable for medical vision systems. For example, local or short-term screening programmes require cheap, flexible image analysis systems if they are to be cost effective. Costs are reduced, and utility increased, when systems can be developed and fine-tuned by domain experts rather than systems engineers. This is true regardless of whether the application is for industry or medicine.
Indeed, it is probably misleading to distinguish between industrial and medical systems, since their image processing techniques (such as histogram equalisation, smoothing, median filtering, edge detection and thresholding [NAZ95]) are common to both. Clearly, therefore, it is important for the medical community to consider the generic industrial vision systems that are currently being developed, such as Cardiff University's Java-based Cyber Image Processing tool-set [KAR98].
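Two of the techniques named above, median filtering and thresholding, can be sketched in a few lines. This is only an illustration, not code from the dissertation or from the cited tool-set: the function names, the 5x5 test image and the threshold level of 128 are all assumptions made for the example.

```python
from statistics import median

def median_filter(image):
    """Apply a 3x3 median filter; border pixels are left unchanged."""
    height, width = len(image), len(image[0])
    result = [row[:] for row in image]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            window = [image[y + dy][x + dx]
                      for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            result[y][x] = median(window)
    return result

def threshold(image, level):
    """Binarise the image: 1 where a pixel is >= level, otherwise 0."""
    return [[1 if pixel >= level else 0 for pixel in row] for row in image]

# Hypothetical grey-scale image: dark background on the left, a bright
# region on the right, and a single noisy pixel (255) in the background.
image = [
    [10, 10, 10, 200, 200],
    [10, 255, 10, 200, 200],
    [10, 10, 10, 200, 200],
    [10, 10, 10, 200, 200],
    [10, 10, 10, 200, 200],
]

filtered = median_filter(image)      # the noisy pixel is replaced by 10
binary = threshold(filtered, 128)    # only the bright region remains
for row in binary:
    print(row)
```

The same two functions would apply equally to an industrial inspection image or a medical one, which is the point being made above.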
Colour has not been widely used by classifiers, but it could be used to classify cells and tissue.
Studies into the use of colour as a classification parameter have so far been limited, possibly for reasons similar to those cited by Batchelor for the limited use of colour image processing in industry [BAT97]. These reasons are shown by Batchelor to be largely no longer valid. They centre on the perceived greater complexity, cost and imprecision of colour image processing systems compared with monochromatic systems.
One study which has shown the potential of colour processing is from Pavlova et al [PAV96]. They found that characteristic hue histograms exist for different types of blood cell. The maxima in the histograms were shown to be invariant for each cell type and were robust to changes in the staining solution. The authors acknowledge that their study has yet to create a concrete quantitative criterion for identification, but point out that similar logic could be applied to other colour microscopic images, such as bone marrow smears, tissue cultures and others.
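The idea of a hue histogram can be illustrated with a minimal sketch. It is not the method of [PAV96], whose details are not given here; the pixel values, the bin count of 12 and the function names are assumptions chosen for the example. It shows only why a hue maximum can stay put when stain intensity changes: scaling a pixel's brightness leaves its hue unchanged.

```python
import colorsys

def hue_histogram(pixels, bins=12):
    """Count pixels per hue bin; pixels are (r, g, b) tuples in 0..255."""
    counts = [0] * bins
    for r, g, b in pixels:
        hue, _saturation, _value = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
        index = min(int(hue * bins), bins - 1)
        counts[index] += 1
    return counts

def peak_hue_bin(counts):
    """Return the index of the fullest hue bin."""
    return counts.index(max(counts))

# Hypothetical pixels from a stained cell: three purple pixels at different
# brightness levels (as if the stain were stronger or weaker) and one pink
# background pixel.
pixels = [
    (150, 60, 180),
    (140, 50, 170),
    (75, 30, 90),      # half the brightness of the first pixel, same hue
    (230, 180, 200),   # pink background
]

counts = hue_histogram(pixels)
print(counts, "peak bin:", peak_hue_bin(counts))
```

All three purple pixels fall into the same bin despite their different brightness, so the histogram maximum does not move.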
The following examples show the benefits for diagnosis and non-invasive monitoring.
The importance of computerised imaging analysis cannot be overstated; benefits are to be gained not only in better diagnoses but also in the development of less invasive monitoring techniques. The following three examples illustrate these points. Each example details a current medical problem and suggests how pattern classifiers and image processing could provide solutions.
Neonatal jaundice is a major worry monitored by blood test.
Jaundice is caused by an excess of the yellow bile pigment bilirubin. Neonatal jaundice occurs in 20-50% of the newborn population and is probably the most common neonatal symptom that causes professional and parental worry [KNU96]. High levels of bilirubin can lead to kernicterus: the staining and subsequent damaging of the brain. Bilirubin levels are monitored by blood sample. In premature babies each sample involves pricking the baby's heel to extract the blood. In the case of severe jaundice, samples must be taken frequently, often twice a day. Repeated blood sampling introduces risks of infection, anaemia and bruising, and causes pain and distress. Additionally there are the expenses of phlebotomy and biochemistry.
Skin colour is used as a guide to jaundice severity, and it might indicate potential brain damage.
The severity of jaundice can also be assessed from the yellow colour of the skin when it is pressed. This is recommended only as a rough guide. Even so, there is evidence that the yellow colour of the skin should provide a better indicator of potential brain damage than does the blood bilirubin concentration [KNU95], [KNU89].
Medical staff are advised to carry out skin colour assessment under natural or fluorescent lighting, and warned that the test can be misleading if light reflects off yellow or orange curtains or bedspreads [BAL93]. In dark-skinned babies the colour of the sclera, palms and soles can be useful, although mixed ethnic standardisation curves have been suggested [LIN94]. An icterometer can be used. This is a piece of clear plastic that is pressed against the baby. The colour of the blanched skin is compared against five coloured strips on the plastic, which give a guide to the level of jaundice. The icterometer has been shown to be useful in reducing the need for blood samples, and may also be useful to peripheral staff in developing countries when deciding on referral to specialist centres [NAR90].
Vision systems might allow continuous jaundice monitoring.
Now, suppose we were able to record the colour of the blanched skin to monitor change, or to compare pressed, blanched skin with non-pressed skin, perhaps allowing us to account for variation in natural skin colour. A pen-sized device with a shielded light source, colour photoreceptors and a learning mechanism is feasible. Could such a device provide a better guide to the level of jaundice? Moreover, could the same device be used for other conditions, such as monitoring anaemia from tissue pallor? Clearly there are confounding factors such as changes in epidermal blood flow. However, some instrumental methods for measuring skin pigmentation and skin colour change have been shown to be both sensitive and reliable [PIE98].
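The comparison of blanched with non-pressed skin could be prototyped along the following lines. Everything here is hypothetical: the `yellowness` index is an invented, uncalibrated measure (how far blue falls below the red/green mean, normalised by brightness), and the RGB readings are made-up values, not clinical data. The sketch shows only the structure of the idea: subtract the unpressed reading so that natural skin colour is partly cancelled out, then watch the difference over successive readings.

```python
def yellowness(rgb):
    """Hypothetical index: blue deficit relative to red/green, scaled by brightness."""
    r, g, b = rgb
    total = r + g + b
    if total == 0:
        return 0.0
    return ((r + g) / 2 - b) / total

def blanch_difference(blanched, unblanched):
    """Yellowness of pressed skin minus that of the adjacent unpressed skin."""
    return yellowness(blanched) - yellowness(unblanched)

# Made-up (blanched, unblanched) RGB readings taken at two successive times.
readings = [
    ((230, 200, 150), (220, 160, 140)),
    ((235, 205, 120), (222, 162, 135)),
]

trend = [blanch_difference(pressed, unpressed) for pressed, unpressed in readings]
print([round(value, 4) for value in trend])
```

A real device would need calibration against measured bilirubin levels before any such index could be interpreted.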
Non-invasive techniques could allow for more frequent or continuous monitoring. With over 133,000,000 babies born each year worldwide [WHO98], usage and sales potential could be huge.
Mammography is relatively reliable, but costly mistakes happen.
Mammography is reported as the most reliable method for detecting lesions in the breast [NAS97]. Even so, interpretation of mammograms has been shown to be affected by factors such as view box luminance and masking [WAN98] and by ambient lighting [KIM97]. Contrast and definition can be degraded by under-processing the mammographic film [SPR96]. This can be caused by physical factors (low developer temperature, inadequate development time, insufficient developer agitation) or chemical factors (developer not optimised for film type; over-diluted, under-replenished or contaminated developer).
The accuracy of mammography screening was recently studied in the USA [ELM98], and the findings are probably indicative of other countries. Of 2,400 women screened over ten years, 24% had at least one false positive mammogram (a false positive is here defined as a suspicious or abnormal finding for which one year of follow-up failed to reveal a tumour). The estimated cumulative risk of a false positive result after 10 mammograms was 49%. A false positive error might at first sight seem trivial - better safe than sorry. However, this study showed that for every $100 spent on screening, an additional $33 was spent to evaluate false positive results. The false positives brought unnecessary outpatient visits, biopsies, ultrasound examinations, and anxiety. A recent Swedish study concluded that examinations and investigations carried out after false positive mammography were a neglected but substantial problem [LIB96].
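The arithmetic behind these figures can be made explicit. The sketch below assumes, purely for illustration, that each screen carries the same false positive rate independently of the others; that assumption is mine and is not taken from [ELM98]. Under it, a 49% cumulative risk over 10 screens corresponds to a rate of roughly 6.5% per screen. The cost line simply re-expresses the $33 per $100 figure as a share of total spending.

```python
def per_screen_rate(cumulative, screens):
    """Per-screen rate implied by a cumulative risk, assuming independent
    screens with a constant false positive rate (a simplification)."""
    return 1 - (1 - cumulative) ** (1 / screens)

def cumulative_risk(per_screen, screens):
    """Risk of at least one false positive over a number of screens."""
    return 1 - (1 - per_screen) ** screens

rate = per_screen_rate(0.49, 10)
print(f"implied per-screen false positive rate: {rate:.3f}")

# $33 of follow-up for every $100 of screening, as a share of the total.
follow_up_share = 33 / (100 + 33)
print(f"share of total spend on false positive follow-up: {follow_up_share:.3f}")
```

Even a modest per-screen error rate therefore compounds into a near even chance of at least one false alarm over a decade of screening.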
Image processing could reduce errors, and perhaps predict tumour type.
False negatives are also possible. According to Goergan et al [GOE97], reportedly 11%-25% of cancers are overlooked on the initial mammogram. They found that cancers with a density not substantially greater than that of the surrounding breast tissue are at particular risk of being missed. Given this, image-processing options, such as segmentation or contrast expansion, seem imperative.
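Contrast expansion, one of the options mentioned above, can be sketched as a simple linear stretch. The 3x3 region and its grey levels are invented for the example: a centre pixel only slightly denser than its surroundings, standing in for a low-contrast lesion. This is a generic technique, not a method taken from the cited studies.

```python
def stretch_contrast(image, out_min=0, out_max=255):
    """Linearly rescale grey levels so the darkest pixel maps to out_min
    and the brightest to out_max."""
    flat = [pixel for row in image for pixel in row]
    low, high = min(flat), max(flat)
    if high == low:
        return [row[:] for row in image]   # flat image: nothing to stretch
    scale = (out_max - out_min) / (high - low)
    return [[round(out_min + (pixel - low) * scale) for pixel in row]
            for row in image]

# Hypothetical low-contrast region: the centre is only 6 grey levels
# brighter than its nearest neighbours.
region = [
    [100, 102, 104],
    [102, 110, 104],
    [100, 102, 104],
]

stretched = stretch_contrast(region)
for row in stretched:
    print(row)
```

After stretching, the 6-level difference between the centre and its neighbours becomes a 153-level difference, which is far easier to see or to threshold.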
Computed radiography has been shown to result in high-quality images of diagnostic value equivalent to conventional film mammograms, provided image processing parameters are carefully selected. It is also worth noting that texture analysis on digital images of surrounding tissue taken during biopsy has been shown to be useful in predicting malignant versus benign outcomes [THI96].
Failures have occurred in cervical screening programmes.
Serious failures in cervical screening seem all too common. Two failures made the headlines during July 1998. At the Prince Charles Hospital, Merthyr Tydfil, inaccuracies in smear tests carried out between 1990 and 1994 were blamed for the deaths of two women and serious delays in the treatment of 12 more. Meanwhile, at St George's Hospital in south London, over 1,000 women had to be recalled after probable errors in colposcopy screening; one woman died and 11 were diagnosed with cancer.
The most serious case of cervical smear test misreporting occurred at Kent and Canterbury Hospital from 1990 to 1995. More than 91,000 smears had to be re-examined in 1996. A total of 5,566 women needed to be followed up, of whom 333 had moderate or severe abnormalities, including 30 who needed hysterectomies. Eight women died.
The National Audit Office has reported that only about half of the 181 laboratories in England meet recommended cervical screening quality standards. The NAO conclude that this means that many laboratories might be missing some abnormalities (false negatives) and many might be reporting abnormalities where none exist (false positives) [BOU98, p47].
Errors are encouraged by the low prevalence of abnormalities.
The pathologists and cytologists who screen cervical smears are dealing with a low-prevalence population. In England during 1996/97 3.8 million women were screened, involving the processing of 4.4 million smears. Only 3,500 new cases of invasive cervical cancer were expected [BOU98].
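The scale of the imbalance follows directly from the figures above. The sketch below only divides the quoted numbers; it treats the expected count of invasive cancers as a rough yardstick, on the assumption (mine, not the source's) that it can be set against one year's screening workload. It says nothing about the prevalence of milder abnormalities, which the text does not quantify.

```python
women_screened = 3_800_000
smears_processed = 4_400_000
expected_invasive_cases = 3_500

# How many women and smears stand behind each expected invasive cancer.
women_per_case = women_screened / expected_invasive_cases
smears_per_case = smears_processed / expected_invasive_cases

# Average number of smears per screened woman, and cases per 100,000 women.
smears_per_woman = smears_processed / women_screened
cases_per_100k = 100_000 * expected_invasive_cases / women_screened

print(f"women screened per expected case:  {women_per_case:.0f}")
print(f"smears processed per expected case: {smears_per_case:.0f}")
print(f"smears per woman:                   {smears_per_woman:.2f}")
print(f"expected cases per 100,000 women:   {cases_per_100k:.0f}")
```

On these figures, well over a thousand smears are processed for every expected case of invasive cancer.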
The screener is searching for subtle variations in the shape, size and structure of cervical cells, but the vast majority of slides will be negative. A screener might view 50 slides a day in Britain, or up to 100 a day in the US. Each slide has 1,000 fields of view, so a screener in the US might manipulate the microscope 100,000 times a day [GAV98]. False negatives might be encouraged by the assimilation of each judgement to its predecessor [LAM95]. This is likely in screeners faced with a long sequence of similar stimuli.
The abnormal cells in false negative slides are likely to be:
Given that their characteristics are often different to genuinely positive cells, rapid re-screening of slides that are considered negative might be of limited effect [MIT95].
Smear testing is invaluable, but advancements are needed.
Despite the scare stories it should not be forgotten that smear testing has been invaluable in reducing the incidence of cervical cancer over the past 50 years [STA97] and no test ever invented has been as successful in preventing cancer [DEM97]. However, advances are clearly needed.
PAPNET is a smear vision inspection system that has proven benefits. But it is expensive and requires specialist training and equipment.
One recent advance has been the development of PAPNET. PAPNET is a computerised screening aid that uses a combination of neural network image recognition, image processing and slide handling robotics to analyse conventional cervical smears [PAP97]. Two neural networks are used; both are feed-forward, back-propagation networks. One neural network is trained with grey-scale images of single cells; the other is trained with images of cell clusters. Participating laboratories ship slides to a PAPNET Slide Scanning Centre, where the image analysis takes place. A CD containing images and location references of potential abnormalities is then returned to the participating laboratory for conventional review.
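For readers unfamiliar with the term, a feed-forward, back-propagation network can be sketched in miniature. This is emphatically not PAPNET's architecture, which is not described here beyond the points above. The network size, learning rate, class name and the toy training set (two made-up features per cell, labelled abnormal only when both are high) are all assumptions made for illustration.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

class TinyNetwork:
    """One hidden layer of sigmoid units and a single sigmoid output."""

    def __init__(self, n_inputs, n_hidden, seed=0):
        rng = random.Random(seed)
        # The last weight in each list is the bias.
        self.w_hidden = [[rng.uniform(-1, 1) for _ in range(n_inputs + 1)]
                         for _ in range(n_hidden)]
        self.w_out = [rng.uniform(-1, 1) for _ in range(n_hidden + 1)]

    def forward(self, inputs):
        """Feed-forward pass: return hidden activations and the output."""
        hidden = [sigmoid(sum(w * x for w, x in zip(ws, inputs + [1.0])))
                  for ws in self.w_hidden]
        output = sigmoid(sum(w * h for w, h in zip(self.w_out, hidden + [1.0])))
        return hidden, output

    def train_step(self, inputs, target, rate=0.5):
        """Back-propagate the squared error for one sample."""
        hidden, output = self.forward(inputs)
        delta_out = (output - target) * output * (1 - output)
        # Hidden deltas are computed before the output weights change.
        delta_hidden = [delta_out * self.w_out[j] * h * (1 - h)
                        for j, h in enumerate(hidden)]
        for j, h in enumerate(hidden + [1.0]):
            self.w_out[j] -= rate * delta_out * h
        for j, ws in enumerate(self.w_hidden):
            for i, x in enumerate(inputs + [1.0]):
                ws[i] -= rate * delta_hidden[j] * x

def mean_squared_error(net, samples):
    return sum((net.forward(x)[1] - t) ** 2 for x, t in samples) / len(samples)

# Made-up training set: two normalised features per cell, target 1.0
# ("abnormal") only when both features are high.
samples = [([0.0, 0.0], 0.0), ([0.0, 1.0], 0.0),
           ([1.0, 0.0], 0.0), ([1.0, 1.0], 1.0)]

net = TinyNetwork(n_inputs=2, n_hidden=3)
initial_loss = mean_squared_error(net, samples)
for _ in range(2000):
    for inputs, target in samples:
        net.train_step(inputs, target)
final_loss = mean_squared_error(net, samples)
print(f"loss before training: {initial_loss:.4f}, after: {final_loss:.4f}")
```

A production system working on grey-scale cell images would need far more inputs, far more training data and careful validation; the sketch shows only the forward pass and the weight update that the two terms refer to.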
Studies have shown that PAPNET reduces the rate of false negatives when used as an adjunct to conventional screening (e.g. [DEN97], [DOO97], [HAL97], [JEN97]). However, the costs of PAPNET screening make it affordable to only some health sectors [MIC97]. PAPNET was approved in the US by the Food and Drug Administration for secondary screening in 1995. In the UK, where it is seen as too expensive for use as a secondary screener, PAPNET has been undergoing trials as a primary screener [GAV98]. But to be used confidently as a primary screener, PAPNET's accuracy must be shown to be beyond doubt.
PAPNET demonstrates that machine vision inspection can reduce errors on tasks such as cervical screening. But it is clear that cheaper and more practicable automated screening systems are needed.