Automated skin biopsy histopathological image annotation using multi-instance representation and learning
- Gang Zhang^{1, 3},
- Jian Yin^{1},
- Ziping Li^{2},
- Xiangyang Su^{4},
- Guozheng Li^{2, 5} and
- Honglai Zhang^{6}
https://doi.org/10.1186/1755-8794-6-S3-S10
© Zhang et al.; licensee BioMed Central Ltd. 2013
Published: 11 November 2013
Abstract
With digitisation and the development of computer-aided diagnosis, histopathological image analysis has attracted considerable interest in recent years. In this article, we address the problem of the automated annotation of skin biopsy images, a special type of histopathological image analysis. In contrast to previous well-studied methods in histopathology, we propose a novel annotation method based on a multi-instance learning framework. The proposed framework first represents each skin biopsy image as a multi-instance sample using a graph cutting method, decomposing the image into a set of visually disjoint regions. Then, we construct two classification models using multi-instance learning algorithms, among which one provides determinate results and the other calculates a posterior probability. We evaluate the proposed annotation framework using a real dataset containing 6691 skin biopsy images, with 15 properties as target annotation terms. The results indicate that the proposed method is effective and medically acceptable.
Background
With the rapid development of computer-aided diagnosis, increasingly more digital data have been stored electronically. It has been a great challenge for doctors and experts to effectively analyse these data. Introducing the power of computational intelligence into this analysis problem would be meaningful and practical, with the potential not only to ease the burden of doctors but also to save time so that doctors and experts can pay more attention to confusing and difficult cases [1].
In skin disease diagnosis, histopathological data provide a microscopic view of skin tissue architecture, which contributes to the correct diagnosis of skin diseases. Microscopic analysis of skin tissue provides further information about what happens under the skin's surface. To confirm a skin disease, on the one hand, doctors should have a clear understanding of the patient's medical history and careful observations of the skin eruption. On the other hand, histopathological data are of great necessity. For example, different patients may appear to have the same rash; however, differences in their histopathological data can distinguish them and aid in diagnosis. Histopathological data provide a comprehensive view of the presence of disease and its effects on patients. Some skin diseases, especially benign skin tumours and skin cancer, should be diagnosed using histopathological information. The information we extract from the data can help a doctor judge a patient's condition, estimate the prognosis, direct treatment, and evaluate the curative effects of treatments. For undiagnosed disease, complete histopathological data can provide an initial assessment of a condition's nature and severity.
Generally, there are two levels of skin disease diagnosis: skin surface inspection [2] and skin biopsy image analysis [3]. The former is a diagnostic procedure that can roughly be reached after routine exams, including observation and the physical examination of skin lesions, whereas the latter is a complement of the former [4, 5], utilised in cases where the doctor has less confidence or even cannot make a decision based only on an inspection of the skin surface. As indicated in histopathological studies, skin biopsy images reveal further information about what happens beneath the skin's surface at a microscopic level [4, 6]. Therefore, the results of skin biopsy image analysis could be explained more accurately than observations of the surface. For a medically acceptable diagnosis, many skin biopsy image cases are usually required to identify the significant changes associated with that specific diagnosis and differentiate them from those of similar skin diseases [7]. Because understanding skin biopsy images requires more professional knowledge and richer experience [8] than inspecting the skin's surface, it becomes a great challenge for doctors to correctly interpret a huge number of skin biopsy images.
Currently, several attempts to undertake the automated histopathological image analysis problem have been reported. Gurcan et al. [1] reviewed some important work on histopathological data analysis, covering studies on different information source processing, segmentation and feature extraction methods for different application backgrounds and model training algorithms. Syed et al. [9] presented an analysis of feature extraction methods for bag-of-features representations of histopathological images. Caicedo et al. [10] proposed a histopathological image classification method based on bag-of-features and a kernel-function-based model training algorithm. They approached the skin cancer histopathology image classification problem by representing images through bag-of-features methods. However, they solved the problem as a traditional single-instance learning problem [11] with a kernel machine. Though widely used in histopathological image feature extraction, bag-of-features representations do not, in fact, reveal the inner structures of histopathological images and, most importantly, they lose original information to some extent [12].
Much of the work in skin image recognition has been reported publicly. We review two important works closely related to our work here. Bunte et al. [13] proposed a novel machine learning method for skin surface image classification. They noticed that existing skin surface image feature extraction methods are only differently weighted strategies of color space. Hence, if an optimal weighted strategy is learned from the training dataset, it can achieve very good performance. In their work, an optimal weights vector is learned through a maximal margin classification algorithm, realising the idea that instead of finding a proper weighting, they derived one. However, their method is not suitable for our task. On the one hand, in their work, manual labelling of normal and lesion regions is required for each skin surface image. Because understanding a skin biopsy image requires more skill and expertise than understanding a skin surface image, this requirement would be a heavy burden for doctors. On the other hand, in the work of Bunte et al., only RGB colour space-based features are used, which cannot fully describe the essential features of biopsy images, e.g., texture, local structures and even visual edges. Moreover, biopsy images are often stained for clearer illustration of tissue structures and different types of cells, which would lead to the failure of purely colour-based feature extraction methods.
Another work that should be emphasised is on Drosophila gene image annotation, proposed by Li et al. [12]. They addressed the problem of the automated annotation of Drosophila embryogene expression patterns in a multi-instance multi-label learning (MIML) framework [14]. Annotation terms are associated with groups of images corresponding to different embryogene developmental stages, but more specifically, the terms are in fact associated with some patches within the group of images. They solve the problem by regarding each image group as a multi-instance sample and annotated terms as labels attached to the sample. They proposed two MIML algorithms for model training. To express a group of images as a bag, they adopt a block division method to generate equal-size patches as instances. Though the general framework of [12] is consistent with our task, it is not naturally suited to skin biopsy image annotation, as Drosophila embryogene images do not contain complex inner structures, textures or colours. Therefore, equal-size block division does not make sense for our task.
In this article, we propose a novel automated annotation framework based on the theory of multi-instance learning. Multi-instance learning is a special learning framework introduced by Dietterich et al. [15] to solve the drug activity prediction problem. Different from single-instance learning, samples in multi-instance learning (also called bags) are composed of several instances with potential concept labels, but only the concept labels of bags are known. For binary classification tasks, a bag is positive if and only if it contains at least one positive instance and negative otherwise. The task of multi-instance learning is to predict the labels of unseen bags by training a model with labelled bags.
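The bag-labelling rule stated above can be written down in a few lines. This is only an illustrative sketch: the function name `bag_label` is hypothetical, and in real multi-instance training the instance labels are hidden, so the rule describes how bag labels are assumed to arise rather than something the learner can evaluate directly.

```python
from typing import Sequence


def bag_label(instance_labels: Sequence[bool]) -> bool:
    """Standard multi-instance assumption for binary tasks:
    a bag is positive iff at least one of its instances is positive."""
    return any(instance_labels)


# A bag with one positive instance is positive; a bag with none is negative.
print(bag_label([False, True, False]))
print(bag_label([False, False]))
```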
We first show that the skin biopsy image annotation task can naturally be decomposed into several binary multi-instance classification tasks. Then, by applying a graph-cutting algorithm and region-based feature extraction methods, we propose an effective method of expressing each skin biopsy image as a bag whose instances are regions. Finally, we propose two algorithms for model building. One is discriminative and produces a binary output indicating whether a given image should be annotated with a certain term. The other one models the conditional distribution p(t_{ i }|I, D) to calculate the posterior probability of annotating an image I with a term t_{ i }, given a training dataset D.
Methods
- 1. Multi-instance sample representation
- 2. Feature extraction
- 3. Training of learning algorithms
Formulation
The proposed annotation framework is motivated by the nature of skin biopsy image recognition, which can be naturally expressed as a multi-instance learning problem. To make this intuition clearer, it is necessary to review the procedure of manually annotating skin biopsy images. From dermatopathological clinical experience, we can see that a set of standard terms is used by doctors to annotate an image. However, doctors are not required to explicitly record the correspondence between standard terms and regions within a given image, leading to the term ambiguity described in the previous section. Because terms are actually associated with certain local regions, it is not reasonable to connect each region of an image to all associated terms, which results in poor models from a machine learning perspective [16]. As illustrated in Figure 2(a)-(d), regions within a given image may have different relationships to the attached terms. It is time-consuming to manually label each region with a set of terms to meet the requirement of traditional single-instance learning. For this reason, by regarding each image as a bag and regions within the image as instances, multi-instance learning is naturally suitable for the annotation task. According to the basic assumption of multi-instance learning [15], a bag can be annotated with a term if it contains at least one region labelled with that term. Otherwise, the bag cannot be annotated with that term. Thus, we can build a set of binary multi-instance classifiers, each of which corresponds to a term. Given an image, each classifier outputs a Boolean value indicating whether its term should be annotated to the image. Thereby, we can address the term ambiguity within a multi-instance learning framework.
Another challenge is how to effectively represent an image as a multi-instance sample, or a bag. The key problem is how to partition an image into several regions to construct instances. Skin tissue is microscopically composed of several different structures, and a doctor needs to inspect them individually to determine abnormal areas. Regions of a skin biopsy image should be divided according to the structures of skin tissue to come up with a feature description for each part, but clustering-based algorithms [17] may not generate contiguous regions. Hence, we apply an image-cutting algorithm, namely Normalized Cut (NCut) [18], to generate visually disjoint local regions. Prior knowledge in dermatopathology suggests that, on the one hand, examining an individual visually disjoint region is sufficient to annotate it in most cases, and on the other hand, there is no considerable relationship between the terms to be annotated in a given image. The former supports the application of our image-cutting method, and the latter allows us to decompose the annotation task into a set of multi-instance binary classification tasks.
Formally, let D = {(I_{ i }, T_{ i })|i = 1, ..., n, I_{ i } ∈ I, T_{ i } ⊆ T} be a set of skin biopsy images associated with a set of annotated terms, where T = {t_{1}, t_{2}, ..., t_{ m }} is a set of standard terms for annotation and I is a set of images. Each image is stored as a pixel matrix in 24-bit RGB colour space. The task is to learn a function f : I → 2^{ T } given D. When given an unseen image I_{ x }, f outputs a subset of T corresponding to the annotation terms of the given image I_{ x }.
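To make the decomposition of f : I → 2^{ T } into per-term binary tasks concrete, the sketch below turns the term sets T_{ i } into an n × m binary matrix (one column per term) and assembles the output subset from m binary decisions. The names `term_indicator_matrix` and `annotate`, and the use of plain callables as stand-ins for trained classifiers, are assumptions made for illustration only.

```python
import numpy as np


def term_indicator_matrix(term_sets, vocabulary):
    """Return an n x m 0/1 matrix Y with Y[i, j] = 1 iff term j is in T_i."""
    index = {term: j for j, term in enumerate(vocabulary)}
    Y = np.zeros((len(term_sets), len(vocabulary)), dtype=int)
    for i, terms in enumerate(term_sets):
        for term in terms:
            Y[i, index[term]] = 1
    return Y


def annotate(image, classifiers, vocabulary):
    """f(I): the subset of terms whose binary classifier accepts the image.

    `classifiers[j]` is any callable returning True/False for term j
    (a placeholder for a trained multi-instance classifier).
    """
    return {t for t, clf in zip(vocabulary, classifiers) if clf(image)}


vocabulary = ["hyperkeratosis", "parakeratosis", "acanthosis"]
term_sets = [{"hyperkeratosis"}, {"parakeratosis", "acanthosis"}, set()]
Y = term_indicator_matrix(term_sets, vocabulary)
print(Y)  # column j is the binary label vector used to train the classifier for term j
```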
We first apply a cutting algorithm to generate visually disjoint regions for each image, given by I_{ i } = {I_{ ij }|j = 1, ..., n_{ i }}, where n_{ i } is the number of regions in image I_{ i }, followed by a feature extraction procedure to express each generated region as a feature vector. Then, we train the target model through two algorithms.
Skin biopsy image representation
Now we present a method for representing a skin biopsy image. First, we express each image as a bag with regions as instances, and then we apply two transformation-invariant feature extraction methods to further express the regions as vectors.
Multi-instance sample representation
To generate visually disjoint regions, we adopt a well-known graph-cutting algorithm, Normalized Cut (NCut), proposed by Shi et al. [18] in 2000, aimed at extracting perceptual groupings from a given image. In contrast with clustering-based image segmentation algorithms, e.g., [17], NCut extracts the global impression of a given image, i.e., disjoint visual grouping. To make this article self-contained, we briefly present the main idea of NCut.
According to [18], the solution to Eq. 2 captures a visual segmentation of an image whose underlying idea is naturally consistent with the clinical experience of skin biopsy image recognition. Eq. 2 can be solved as a standard Rayleigh quotient [19]. We ignore the detailed procedure for brevity. The computational time complexity of NCut for a given image is O(n^{2}), where n is the number of pixels in an image.
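As a rough illustration of the Rayleigh-quotient view of NCut, the sketch below bipartitions a small weighted graph using the second-smallest eigenvector of the normalised Laplacian, which is the standard spectral relaxation described in [18]. It is a toy under stated assumptions: the affinity matrix `W` is supplied directly (building pixel affinities from an image is not shown), every node is assumed to have positive degree, the split point is simply zero, and partitioning into more than two regions is left out.

```python
import numpy as np


def ncut_bipartition(W):
    """Two-way normalised-cut relaxation on a symmetric affinity matrix W.

    Solves (D - W) y = lambda * D y through the equivalent symmetric problem
    on I - D^{-1/2} W D^{-1/2}, then thresholds the second-smallest
    eigenvector at zero. Returns a boolean group label per node.
    """
    W = np.asarray(W, dtype=float)
    degree = W.sum(axis=1)                      # diagonal of D
    d_inv_sqrt = 1.0 / np.sqrt(degree)
    L_sym = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, eigvecs = np.linalg.eigh(L_sym)          # eigenvalues in ascending order
    y = d_inv_sqrt * eigvecs[:, 1]              # map back to the generalised problem
    return y > 0


# Toy graph: two tightly connected groups of three nodes, weakly linked.
W = np.full((6, 6), 0.01)
W[:3, :3] = 1.0
W[3:, 3:] = 1.0
np.fill_diagonal(W, 0.0)
print(ncut_bipartition(W))
```

Which group receives `True` depends on the arbitrary sign of the eigenvector, so only the grouping itself is meaningful.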
Feature extraction based on 2D-DWT
- 1. Input a local region IR generated by NCut. Note that regions generated by NCut are irregular. For convenience, we store them as minimum covering rectangles by padding the regions with black pixels, as indicated in Figure 5. This padding does not significantly affect model performance, as most of these padding pixels will be discarded in later steps.
- 2. Colour space transformation. IR is expressed in RGB and is converted to LUV colour space, denoted as IR_LUV. Calculate features f_{1} = mean(IR_LUV.L), f_{2} = mean(IR_LUV.U) and f_{3} = mean(IR_LUV.V).
- 3. Divide IR_LUV into squares of size m × m pixels, resulting in (width/m) × (height/m) blocks, denoted as B_{ pq }, where p ∈ {1, ..., width/m} and q ∈ {1, ..., height/m}. Eliminate blocks that are totally black, so as to remove padding pixels as much as possible.
- 4. Apply 2D-DWT to each B_{ pq }, and keep coefficients LH, HL and HH. Let ${t}_{x}=\sqrt{\frac{1}{4}{x}^{T}x}$, where x ∈ {LH, HL, HH}. Average t_{ x } for all blocks within a region to obtain features f_{4}, f_{5}, f_{6}.
- 5. Following [20], calculate the normalized inertia of order 1, 2 and 3 as features f_{7}, f_{8}, f_{9}.
After the above 5 steps, a 9-ary real vector is obtained for each region. An image is transformed into a set of disjoint regions, represented as real feature vectors. Thus we turn the original dataset into a multi-instance representation. Note that this representation is invariant to transformation, as 2D-DWT extracts texture features of regions that are irrelevant to rotation angle and magnification. The other features, LUV mean and normalized inertia of orders 1, 2 and 3, are also transformation-invariant. In the following section, we will provide an in-depth discussion of the effectiveness of this feature extraction method.
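The texture part of this procedure (block division, wavelet detail bands, and the averaged statistic t_{ x }) can be sketched as follows. Several choices here are assumptions rather than details given in the text: a one-level Haar wavelet, a block size of m = 4 (so that each detail band of a block has four coefficients and the 1/4 factor is a mean), and a single-channel input such as the L channel. The assignment of the names LH and HL to the two directional bands follows one common convention and may be swapped in other libraries.

```python
import numpy as np


def haar_details(block):
    """One-level 2D Haar transform of an even-sized block; returns (LH, HL, HH)."""
    a = block[0::2, 0::2]
    b = block[0::2, 1::2]
    c = block[1::2, 0::2]
    d = block[1::2, 1::2]
    lh = (a - b + c - d) / 2.0   # differences between neighbouring columns
    hl = (a + b - c - d) / 2.0   # differences between neighbouring rows
    hh = (a - b - c + d) / 2.0   # diagonal differences
    return lh, hl, hh


def texture_features(region, m=4):
    """Average t_x = sqrt(mean(x^2)) over all non-black m x m blocks.

    `region` is a 2D array (one channel of the padded rectangle).
    Returns an array of three values corresponding to f4, f5, f6.
    """
    region = np.asarray(region, dtype=float)
    height, width = region.shape
    per_block = []
    for r in range(0, height - m + 1, m):
        for c in range(0, width - m + 1, m):
            block = region[r:r + m, c:c + m]
            if not block.any():          # totally black block: treated as padding
                continue
            per_block.append([np.sqrt(np.mean(x ** 2)) for x in haar_details(block)])
    if not per_block:
        return np.zeros(3)
    return np.mean(per_block, axis=0)


# Alternating dark/bright columns produce energy in one directional band only.
stripes = np.tile([0.0, 1.0], (8, 4))
print(texture_features(stripes))
```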
Feature extraction based on SIFT
Scale-invariant feature transform (SIFT) [21] is a well-studied feature extraction method widely used in the study of medical image classification. Caicedo et al. [10] used SIFT to extract histopathological image features. We apply SIFT as our second feature extraction strategy. Unlike 2D-DWT, SIFT has been proven to be a robust key point selector in different image annotation and analysis applications. We use the common setting of SIFT, in which 8 orientations and 4 × 4 blocks are used, resulting in a 128-dimensional vector. Intuitively speaking, SIFT selects several outstanding points to represent a given image. We apply SIFT to the NCut-generated regions to obtain a feature vector.
Model training
We propose two multi-instance learning algorithms to train our model. The first algorithm is based on Citation-KNN [22], and the second is a Bayesian multi-instance learning algorithm, namely Gaussian Process Multi-Instance Learning (GPMIL) [23]. Citation-KNN was first proposed by Jun Wang et al. [22] and can be regarded as a multi-instance version of traditional KNN classifiers. To determine a given test bag's label, Citation-KNN considers not only the K nearest labelled bags, i.e., references, but also labelled bags that regard the given bag as a K nearest neighbour, i.e., citers. Citation-KNN is well studied and has many successful applications in machine learning. GPMIL introduced a Gaussian process prior and solved the multi-instance learning problem in a Bayesian learning framework. The essential idea of GPMIL is that by defining a set of latent variables and the likelihood function, it establishes the relationship between class labels and instances in a probabilistic framework. By imposing a Gaussian process prior on these latent variables, we can use a Bayesian learning strategy to derive a posterior distribution of annotation terms given a training dataset and a test image.
We extend these two algorithms to meet the requirements of our annotation task, taking into consideration some insights into skin biopsy image annotation. On the one hand, because there is no prior knowledge on which to base multi-instance learning assumptions [24] for our task, we build the model on the original assumption [15]. Citation-KNN with a properly defined similarity metric is a simple but effective algorithm in this case. On the other hand, the confidence level of a term to be annotated to a given image is preferred, which requires us to model the predictive distribution of annotation terms. To achieve this goal, we extend Bayesian learning to the multi-instance setting and model the posterior distribution of the annotation terms. An additional benefit of the Bayesian learning framework is that it is possible to model correlation between annotation terms, leading to a more general model.
Citation-KNN for annotation
Citation-KNN is a multi-instance learning algorithm inspired by the citation and reference system in scientific literature. To determine the label of a test bag X, it considers not only the neighbours (references) of X but also the bags (citers) that regard X as a neighbour. Citation-KNN uses both references and citers to determine an unseen bag's concept label. The key problem is how to evaluate distances between bags to identify references and citers.
where AHD measures the average Hausdorff distance between two bags A and B, and a, b are instances in each bag. d(x, y) is the Euclidean distance function in instance space. As indicated in [25], AHD achieves a better performance than other set distance functions in multi-instance learning. The intuitive definition of AHD is the average minimal distance between instances from two bags, which better evaluates the spatial relationship between a pair of bags.
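A compact sketch of Citation-KNN with this bag distance is given below. It assumes the usual form of the average Hausdorff distance (the sum of nearest-neighbour distances in both directions divided by |A| + |B|), decides by simple voting over references and citers, and resolves ties as negative; the parameter names `n_refs` and `n_citers` and the toy data are hypothetical.

```python
import numpy as np


def ahd(A, B):
    """Average Hausdorff distance between two bags of instance vectors."""
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    d = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=2)  # pairwise Euclidean
    return (d.min(axis=1).sum() + d.min(axis=0).sum()) / (len(A) + len(B))


def citation_knn_predict(train_bags, train_labels, test_bag, n_refs=2, n_citers=2):
    """Predict a binary label for `test_bag` from references and citers."""
    n = len(train_bags)
    d_test = np.array([ahd(test_bag, bag) for bag in train_bags])
    # References: the n_refs training bags nearest to the test bag.
    refs = [int(i) for i in np.argsort(d_test, kind="stable")[:n_refs]]
    # Citers: training bags that rank the test bag among their n_citers nearest.
    citers = []
    for i in range(n):
        closer = sum(
            ahd(train_bags[i], train_bags[j]) < d_test[i] for j in range(n) if j != i
        )
        if closer < n_citers:
            citers.append(i)
    votes = [train_labels[i] for i in refs + citers]
    positives = sum(votes)
    return positives > len(votes) - positives   # ties count as negative


train_bags = [
    [[0, 0], [0, 1]], [[1, 0], [0.5, 0.5]],   # negative bags near the origin
    [[5, 5], [5, 6]], [[6, 5], [5.5, 5.5]],   # positive bags near (5, 5)
]
train_labels = [0, 0, 1, 1]
print(citation_knn_predict(train_bags, train_labels, [[5, 5.5]]))
```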
- 1. Cluster the training set D to obtain s clusters and denote the centroid of each cluster as c_{ i }, i = 1, ..., s.
- 2. Compute the AHD distance between each training sample and each centroid c_{ i }, and keep the K nearest training samples for each c_{ i } in the ith row of LM.
Thus we obtain an s-by-K locality matrix LM. When testing an image, we first calculate the distance between centroids and the given image, then discard the centroids that are far from the given image. For the remaining centroids, we perform a table lookup on LM to find the corresponding rows; only the training samples associated with such rows are needed in distance computation. We can prune out a large portion of the training samples that are far away from the test image, which greatly reduces the computational cost. The matrix needs to be computed only once before testing, at cost O(n^{2}), where n = |D| stands for the size of the training set.
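The pruning idea can be sketched as a precomputed lookup table. In this illustration the cluster centroids are simply taken to be representative bags chosen from the training set, which is an assumption (the clustering step itself is not shown), and the names `build_locality_matrix`, `candidate_indices` and `n_keep` are hypothetical.

```python
import numpy as np


def ahd(A, B):
    """Average Hausdorff distance between two bags (usual form, assumed)."""
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    d = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=2)
    return (d.min(axis=1).sum() + d.min(axis=0).sum()) / (len(A) + len(B))


def build_locality_matrix(train_bags, centroid_bags, K):
    """Row i holds the indices of the K training bags nearest to centroid i."""
    LM = np.empty((len(centroid_bags), K), dtype=int)
    for i, centroid in enumerate(centroid_bags):
        d = np.array([ahd(centroid, bag) for bag in train_bags])
        LM[i] = np.argsort(d, kind="stable")[:K]
    return LM


def candidate_indices(test_bag, centroid_bags, LM, n_keep):
    """Keep the n_keep centroids nearest to the test bag and look up their rows."""
    d = np.array([ahd(test_bag, centroid) for centroid in centroid_bags])
    kept = np.argsort(d, kind="stable")[:n_keep]
    return sorted(set(LM[kept].ravel().tolist()))


train_bags = [
    [[0, 0], [0, 1]], [[1, 0], [0.5, 0.5]],
    [[5, 5], [5, 6]], [[6, 5], [5.5, 5.5]],
]
centroid_bags = [train_bags[0], train_bags[2]]      # one representative per cluster
LM = build_locality_matrix(train_bags, centroid_bags, K=2)
print(candidate_indices([[5, 5.5]], centroid_bags, LM, n_keep=1))
```

Only the returned candidates then need full distance computations when searching for references and citers.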
GPMIL
where, on the right-hand side of Eq. (9), p(t|G_{ X }, X) represents the likelihood function of the test image X, and p(G_{ X }|D, T, X) represents the posterior distribution of the latent variable G_{ X }, given by $p\left({G}_{X}|D,T,X\right)=\int p\left({G}_{X}|{G}_{D},D,X\right)p\left({G}_{D}|D,T\right)d{G}_{D}$. For each test image X, using the whole training dataset and the corresponding annotation vector T, we can obtain a predictive distribution that is a function of X and t. An effective method for solving Eq. (9) can be found in [23, 27].
- 1. Suppose we have a training image set D associated with a binary annotation vector for term t and a test image X.
- 2.
- 3. Following Eqs. (7), (8) and (9), we write down the analytical form of the predictive distribution for X.
- 4. We use an approximation method to transform the predictive distribution into a Gaussian distribution that can be solved analytically. After this step, a closed-form solution can be obtained for testing any unseen images. In other words, the training set can be discarded in the testing step.
For each annotation term t, a model is trained by using GPMIL. For a test image, each model calculates a probability indicating the confidence of annotating the image with the corresponding term.
Evaluation
Dataset description
15 annotation terms with occurrence rates
No. | Name | Rate |
---|---|---|
t1 | hyperkeratosis | 28.65% |
t2 | parakeratosis | 22.71% |
t3 | absent granular cell layer | 1.8% |
t4 | acanthosis | 32.15% |
t5 | thin prickle cell layer | 4.14% |
t6 | hyperpigmentation of basal cell layer | 6.48% |
t7 | Munro microabscess | 2.61% |
t8 | nevocytic nests | 9.12% |
t9 | infiltration of lymphocytes | 36.99% |
t10 | basal cell liquefaction degeneration | 4.46% |
t11 | horn cyst | 6.31% |
t12 | hypergranulosis | 8.25% |
t13 | follicular plug | 3.72% |
t14 | papillomatosis | 16.48% |
t15 | retraction space | 4.53% |
A binary matrix is obtained by text matching, in which each row is a 15-ary binary vector indicating whether an image has been annotated with these terms. Based on domain knowledge, a skin biopsy image is possibly composed of up to 15 regions. We set the number of regions p as 8, 10 or 12 for separate runs of our proposed algorithm, then combine them through majority voting. Images fed to NCut are all rescaled to 200 × 150 pixels for effective calculation. The feature extraction methods were applied to the rescaled images instead of the original ones because the rescaled images contain sufficient information.
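Combining the runs with different numbers of regions can be sketched as a per-term majority vote. This assumes the vote is taken over the binary decisions of each run for each term, which the text does not spell out; the function name `majority_vote` and the example values are hypothetical.

```python
import numpy as np


def majority_vote(predictions):
    """Combine binary predictions from several runs.

    `predictions` has shape (n_runs, n_terms); a term is kept when more
    than half of the runs predict it.
    """
    P = np.asarray(predictions, dtype=int)
    return P.sum(axis=0) * 2 > P.shape[0]


# Rows: runs with p = 8, 10 and 12 regions; columns: three example terms.
runs = [
    [1, 0, 1],
    [1, 1, 0],
    [0, 0, 1],
]
print(majority_vote(runs))
```

With an odd number of runs, as here, no ties can occur.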
Evaluation criteria
As mentioned in the previous section, we decomposed the annotation task into several binary classification tasks. Zero-one loss is a straightforward criterion for each of these tasks; we report its complement, the proportion of correctly annotated images, as precision. Because multiple terms are associated with an image, multi-label machine learning evaluation criteria are also suitable for our task. We also introduce Hamming loss for evaluation, whose definition can be found in [28]. Intuitively speaking, Hamming loss is a measure of how many object-term pairs are annotated by mistake. Note that smaller values of Hamming loss indicate better model performance. Zero-one loss evaluates the annotation performance of a single term, whereas Hamming loss evaluates the whole model output for all terms.
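Both criteria are easy to compute from the binary annotation matrix and a matrix of predictions of the same shape. The sketch below uses the usual multi-label definition of Hamming loss (the fraction of image-term pairs that are annotated by mistake) and a per-term proportion of correct decisions; the function names and the small matrices are made up for illustration.

```python
import numpy as np


def hamming_loss(Y_true, Y_pred):
    """Fraction of (image, term) pairs on which prediction and truth disagree."""
    Y_true = np.asarray(Y_true, dtype=bool)
    Y_pred = np.asarray(Y_pred, dtype=bool)
    return float(np.mean(Y_true != Y_pred))


def per_term_accuracy(Y_true, Y_pred):
    """Proportion of images annotated correctly, computed separately per term."""
    Y_true = np.asarray(Y_true, dtype=bool)
    Y_pred = np.asarray(Y_pred, dtype=bool)
    return np.mean(Y_true == Y_pred, axis=0)


Y_true = [[1, 0, 1], [0, 1, 0]]
Y_pred = [[1, 1, 1], [0, 1, 1]]
print(hamming_loss(Y_true, Y_pred))       # 2 wrong pairs out of 6
print(per_term_accuracy(Y_true, Y_pred))  # one value per term
```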
Evaluation results
Evaluation of feature extraction and model training methods
Precisions of different models
Term | Citation-KNN (2D-DWT) | Citation-KNN (SIFT) | GPMIL (2D-DWT) | GPMIL (SIFT) | BOF |
---|---|---|---|---|---|
t1 | 63.24% | 59.06% | **65.14%** | 64.55% | 58.05% |
t2 | 66.45% | 67.12% | 67.56% | **67.63%** | 64.34% |
t3 | 69.54% | 66.47% | **70.40%** | 68.29% | 57.93% |
t4 | 73.88% | 70.85% | **77.78%** | 72.23% | 74.55% |
t5 | 59.12% | 60.21% | **62.12%** | 58.23% | 56.71% |
t6 | 63.41% | 63.00% | 65.12% | **66.02%** | 58.55% |
t7 | 69.42% | 71.23% | 71.98% | 70.24% | **76.60%** |
t8 | 70.04% | 66.73% | **73.12%** | 69.44% | 62.86% |
t9 | 78.19% | 79.11% | 75.00% | **81.49%** | 76.82% |
t10 | **72.42%** | 68.48% | 71.34% | 69.49% | 64.03% |
t11 | 81.42% | 80.91% | **85.12%** | 83.23% | 81.95% |
t12 | 75.00% | 74.83% | 73.52% | **78.56%** | 74.82% |
t13 | 80.12% | 78.02% | **83.13%** | 80.04% | 77.85% |
t14 | **84.21%** | 82.35% | 82.34% | 83.12% | 80.48% |
t15 | 81.23% | 80.34% | 83.55% | **85.90%** | 79.22% |
In Table 2, the column BOF stands for the result of the bag-of-features method proposed in [10]. The best result in each row has been highlighted in bold. It can be observed that the multi-instance learning-based methods are superior to the bag-of-features-based method for annotating most terms. Each of the two feature extraction methods achieved the best performance in some cases, so we cannot simply determine which one is superior. Some prior knowledge or experience can be introduced to determine the most suitable feature representation method. Another factor that should be noted is the stability of the proposed method, which achieves higher precision and lower variance compared to the baseline method, meaning that the proposed method is more reliable and stable for the annotation of different terms.
Hamming loss of different models
Citation-KNN (2D-DWT) | Citation-KNN (SIFT) | GPMIL (2D-DWT) | GPMIL (SIFT) | BOF |
---|---|---|---|---|
31.24% | 29.56% | 26.54% | 27.02% | 35.03% |
The impact of number of regions
Ensemble results of different numbers of regions
Citation-KNN (2D-DWT) | Citation-KNN (SIFT) | GPMIL (2D-DWT) | GPMIL (SIFT) | BOF |
---|---|---|---|---|
31.24% | 29.56% | 26.54% | 27.02% | 35.03% |
The impact of an imbalanced training set
An illustration of the model output
Discussion
Multi-instance representation vs. bag-of-features
In histopathological and dermatopathological image analysis, a large amount of work was based on bag-of-features construction [10, 29–31], in which a dictionary is built whose elements are small patches from a set of training images and can be regarded as keywords. To classify or annotate a given image, these methods need only examine the presence or quantity of keywords in the image. Thus the image can be expressed as a histogram of elements in the dictionary.
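For comparison, the bag-of-features representation described above can be sketched as a nearest-codeword histogram. The dictionary is assumed to be given (for example, cluster centres of training patch descriptors, which is one common choice and not shown here), and the function name `bof_histogram` and the toy values are hypothetical.

```python
import numpy as np


def bof_histogram(patch_descriptors, dictionary):
    """Normalised histogram of nearest dictionary elements for one image.

    `patch_descriptors` has shape (n_patches, d) and `dictionary` has
    shape (n_words, d); each patch votes for its closest dictionary element.
    """
    X = np.asarray(patch_descriptors, dtype=float)
    C = np.asarray(dictionary, dtype=float)
    d = np.linalg.norm(X[:, None, :] - C[None, :, :], axis=2)
    nearest = d.argmin(axis=1)
    counts = np.bincount(nearest, minlength=len(C))
    return counts / counts.sum()


dictionary = [[0, 0], [10, 10]]
patches = [[0, 1], [1, 0], [9, 9], [0.5, 0.5]]
print(bof_histogram(patches, dictionary))
```

The histogram keeps only how often each dictionary element occurs, not where the patches lie in the image.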
Our multi-instance framework is quite different from bag-of-features-based methods. The proposed framework retains original features through direct feature extraction methods, whereas bag-of-features-based methods only generate some statistical measures, e.g., histogram of the elements in a dictionary, which may cause some loss of discriminative information. Meanwhile, the elements of a dictionary in a bag-of-features-based method are often derived from grid-based image patches. We argue that such patches are not able to fully capture the essential discriminative information contained in histopathological images. The proposed framework generates meaningful local regions with visually disjoint edges using NCut, which is more consistent with diagnostic experience in dermatopathology.
Number of regions of Normalized Cut
We addressed some issues related to setting a reasonable number of regions. Though the evaluation results showed that an ensemble with different numbers of regions yields an acceptable result, this method lacks a good explanation. When inspecting skin biopsy images, a small number of regions indicates that the doctor is focusing on relatively global features, whereas a large number indicates more detailed features. Doctors' behaviour may range from global to detailed according to their knowledge and experience. Skin tissue is composed of three anatomically distinct layers, namely the epidermis, dermis, and subcutaneous tissue (fat). The epidermis can be further divided into four layers. Each layer has a distinctive stained colour and special structures. Distinct pathological changes involving any of these whole layers, such as hyperkeratosis, acanthosis and hyperpigmentation of the basal cell layer, can be easily recognised in a small number of segmentations. Specific changes within a layer, such as a Munro microabscess, nevocytic nests or infiltration of lymphocytes, can be more accurately detected when the image is divided into more pieces. Either a global or a detailed view is reasonable in diagnosis, which is consistent with the above evaluation results.
Relationship between regions
Considering the relationships between regions, it should be noted that skin tissues have clearly featured inner structures. Some correlation can be observed between the presence of different terms within an image. For example, terms such as hyperkeratosis and parakeratosis can only be found in certain regions and above features such as acanthosis or hyperpigmentation of the basal cell layer (if the term is attached to the same image). Theoretically speaking, GPMIL can capture such correlations to some extent by defining a different likelihood function [27]. Our Gaussian process prior for GPMIL also implies such relationships. However, previous work [32] reported that the inclusion of such relationships did not make a positive contribution to model performance. We attribute this phenomenon to the doctors' experience implied in the training dataset, i.e., that doctors or experts pay more attention to important local regions, which statistically reduces the emphasis on relationships between regions.
Conclusion
In this work, we introduce the application of multi-instance representation and learning to the recognition and annotation of dermatopathological skin biopsy images. To represent a skin biopsy image as a multi-instance sample, we apply Normalized Cut to divide an image into visually disjoint regions and then extract features for each region through 2D-DWT and SIFT-based algorithms. Two training algorithms have been proposed for model building: Citation-KNN provides a binary output, and GPMIL calculates a probability indicating the confidence level of the model output. The evaluation results show that the proposed method is effective for biopsy image recognition and annotation.
Medically, the results contribute to the development of dermatopathology. Time and cost would be lower if a computer program could take over the annotation work of a pathologist. The accuracy of diagnosis would be increased if subjective factors, such as a doctor's skill, and objective factors, such as light, were eliminated. The application accords with developing trends in dermatopathology. Further work will include introducing relationships between terms in a multi-instance multi-label framework and designing more powerful region recognition and feature extraction methods.
Declarations
Acknowledgements
Based on "Multi-Instance Learning for Skin Biopsy Image Features Recognition", by Gang Zhang, Xiangyang Shu, Zhaohui Liang, Yunting Liang, Shuyi Chen and Jian Yin which appeared in Bioinformatics and Biomedicine (BIBM), 2012 IEEE International Conference on. ^{©} 2012 IEEE 978-1-4673-2560-8/12/.
The authors would like to thank Prof. Dacan Chen and Prof. Zhaohui Liang from The Second Affiliated Hospital of Guangzhou University of Chinese Medicine for their inspiring suggestions, assistance and financial aid during the study. This work is supported by the National Natural Science Foundation of China (No. 81274003, 61033010, 61272065), Guangdong Provincial Foundation of Medical Science Research (No. A2012215), Natural Science Foundation of Guangdong Province (S2011020001182), Research Foundation of Science and Technology Plan Project in Guangdong Province and Guangzhou City (2009B030801090, 2010A040303004, 11A12050914, 11A31090341, 2011Y5-00004), Research Foundation of Guangdong Provincial Hospital of Chinese Medicine (No. 2013KT1067), Research Foundation of Sysung-Etri project (2011A091000026) and the 2012 College Student Career and Innovation Training Plan Project (1184512043).
The publication costs for this article were funded by the corresponding author.
This article has been published as part of BMC Medical Genomics Volume 6 Supplement 3, 2013: Selected articles from the IEEE International Conference on Bioinformatics and Biomedicine 2012: Medical Genomics. The full contents of the supplement are available online at http://www.biomedcentral.com/bmcmedgenomics/supplements/6/S3.
References
- Gurcan MN, Boucheron LE, Can A, Madabhushi A, Rajpoot NM, Yener B: Histopathological Image Analysis: A Review. Biomedical Engineering, IEEE Reviews in. 2009, 2: 147-171.
- Cerroni L, Argenyi Z, Cerio R, Facchetti F, Kittler H, Kutzner H, Requena L, Sangueza OP, Smoller B, Wechsler J, Kerl H: Influence of evaluation of clinical pictures on the histopathologic diagnosis of inflammatory skin disorders. J Am Acad Dermatol. 2010, 63 (4): 647-52. 10.1016/j.jaad.2009.09.009.
- Fogelberg A, Ioffreda M, Helm KF: The utility of digital clinical photographs in dermatopathology. J Cutan Med Surg. 2004, 8 (2): 116-21.
- Llamas-Velasco M, Paredes BE: Basic concepts in skin biopsy. Part I. Actas Dermosifiliogr. 2012, 103: 12-20. 10.1016/j.ad.2011.05.007.
- Ferrara G, Argenyi Z, Argenziano G, Cerio R, Cerroni L, Di Blasi A, Feudale EA, Giorgio CM, Massone C, Nappi O, Tomasini C, Urso C, Zalaudek I, Kittler H, Soyer HP: The influence of clinical information in the histopathologic diagnosis of melanocytic skin neoplasms. PLoS One. 2009, 4 (4): e5375. 10.1371/journal.pone.0005375.
- Neitzel CD: Biopsy techniques for skin disease and skin cancer. Oral Maxillofac Surg Clin North Am. 2005, 17 (2): 143-6. 10.1016/j.coms.2005.02.002.
- Grayson W: Recognition of Dual or Multiple Pathology in Skin Biopsies from Patients with HIV/AIDS. Patholog Res Int. 2011, 2011: 398546.
- Sellheyer K, Bergfeld WF: A retrospective biopsy study of the clinical diagnostic accuracy of common skin diseases by different specialties compared with dermatology. J Am Acad Dermatol. 2005, 52 (5): 823-30. 10.1016/j.jaad.2004.11.072.
- Raza SH, Parry RM, Moffitt RA, Young AN, Wang MD: An Analysis of Scale and Rotation Invariance in the Bag-of-Features Method for Histopathological Image Classification. MICCAI (3), Volume 6893 of Lecture Notes in Computer Science. Edited by: Fichtinger G, Martel AL, Peters TM. 2011, Springer, 66-74.
- Caicedo JC, Cruz-Roa A, González FA: Histopathology Image Classification Using Bag of Features and Kernel Functions. AIME, Volume 5651 of Lecture Notes in Computer Science. Edited by: Combi C, Shahar Y, Abu-Hanna A. 2009, 126-135.
- Bishop CM: Pattern Recognition and Machine Learning (Information Science and Statistics). 2006, Secaucus, NJ, USA: Springer-Verlag New York, Inc.
- Li YX, Ji S, Kumar S, Ye J, Zhou ZH: Drosophila Gene Expression Pattern Annotation through Multi-Instance Multi-Label Learning. IEEE/ACM Trans Comput Biology Bioinform. 2012, 9: 98-112.
- Bunte K, Biehl M, Jonkman MF, Petkov N: Learning effective color features for content based image retrieval in dermatology. Pattern Recogn. 2011, 44 (9): 1892-1902. 10.1016/j.patcog.2010.10.024.
- Zhou ZH, Zhang ML, Huang SJ, Li YF: Multi-instance multi-label learning. Artif Intell. 2012, 176: 2291-2320. 10.1016/j.artint.2011.10.002.
- Dietterich TG, Lathrop RH, Lozano-Pérez T: Solving the multiple instance problem with axis-parallel rectangles. Artif Intell. 1997, 89 (1-2): 31-71.
- Zhang ML: Generalized Multi-Instance Learning: Problems, Algorithms and Data Sets. 2009.
- Chen Y, Wang JZ: Image Categorization by Learning and Reasoning with Regions. J Mach Learn Res. 2004, 5: 913-939.
- Shi J, Malik J: Normalized Cuts and Image Segmentation. IEEE Trans Pattern Anal Mach Intell. 2000, 22 (8): 888-905. 10.1109/34.868688.
- Golub GH, Van Loan CF: Matrix Computations. Johns Hopkins Series in the Mathematical Sciences. 1989, Baltimore: Johns Hopkins University Press, 2nd edition.
- Gersho A: Asymptotically optimal block quantization. Information Theory, IEEE Transactions on. 1979, 25 (4): 373-380. 10.1109/TIT.1979.1056067.
- Lowe DG: Distinctive Image Features from Scale-Invariant Keypoints. Int J Comput Vision. 2004, 60 (2): 91-110.
- Wang J, Zucker JD: Solving Multiple-Instance Problem: A Lazy Learning Approach. 2000.
- Kim M, De la Torre F: Gaussian Processes Multiple Instance Learning. Proceedings of the 27th International Conference on Machine Learning (ICML-10), June 21-24, 2010, Haifa, Israel. Edited by: Fürnkranz J, Joachims T. 2010, Omnipress, 535-542.
- Foulds J, Frank E: A Review of Multi-Instance Learning Assumptions. Knowl Eng Rev. 2010, 25: 1-25. 10.1017/S026988890999035X.
- Zhang ML, Zhou ZH: Multi-instance clustering with applications to multi-instance prediction. Applied Intelligence. 2009, 31: 47-68. 10.1007/s10489-007-0111-x.
- Rasmussen CE, Williams C: Gaussian Processes for Machine Learning. 2006, MIT Press.
- He J, Gu H, Wang Z: Bayesian multi-instance multi-label learning using Gaussian process prior. Mach Learn. 2012, 88 (1-2): 273-295.
- Zhang ML, Wang ZJ: MIMLRBF: RBF neural networks for multi-instance multi-label learning. Neurocomputing. 2009, 3951-3956.
- Ji S, Li YX, Zhou ZH, Kumar S, Ye J: A bag-of-words approach for Drosophila gene expression pattern annotation. BMC Bioinformatics. 2009, 10: 119. 10.1186/1471-2105-10-119.
- Cruz-Roa A, Caicedo JC, González FA: Visual Pattern Analysis in Histopathology Images Using Bag of Features. CIARP, Volume 5856 of Lecture Notes in Computer Science. Edited by: Bayro-Corrochano E, Eklundh JO. 2009, Springer, 521-528.
- Rueda A, Arevalo JE, Cruz-Roa A, Romero E, González FA: Bag of Features for Automatic Classification of Alzheimer's Disease in Magnetic Resonance Images. CIARP, Volume 7441 of Lecture Notes in Computer Science. Edited by: Álvarez L, Mejail M, Gómez L, Jacobo JC. 2012, Springer, 559-566.
- Zhang G, Shu X, Liang Z, Liang Y, Chen S, Yin J: Multi-instance learning for skin biopsy image features recognition. Bioinformatics and Biomedicine (BIBM), 2012 IEEE International Conference on. 2012, 1-6. 10.1109/BIBM.2012.6392648.
Copyright
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.