Mining similar radiology reports using BoW and Fuzzy C-means clustering

Türkeli S. , Gazioglu B. S. A. , Kurt K. K. , Atay H. T. , Gorur Y.

2017 International Artificial Intelligence and Data Processing Symposium, IDAP 2017, Malatya, Turkey, 16 - 17 September 2017


© 2017 IEEE. Finding similar diagnoses for the same region is vital for patients. In this paper, we aim to find similar radiology reports using the bag-of-words (BoW) and Fuzzy C-Means clustering methods. A two-layer structure is applied: first, the BoW method extracts features from the data, and then the Fuzzy C-Means algorithm clusters the blocks into a similar cluster and a non-similar cluster. We examined 457 radiology reports collected from a research and education hospital in Istanbul. The data were tested against 23 regions and 137 diagnoses. A vocabulary consisting of these regions and diagnoses was created based on the radiologist's opinion. Experimental results on the data sets show that, for standard documents, BoW and Fuzzy C-Means clustering can be used to find similarity.
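The two-layer pipeline described above can be illustrated with a minimal sketch. It is not the authors' implementation: the vocabulary, the sample reports, the whitespace tokenizer, and the parameter values (c = 2 clusters, fuzzifier m = 2) are all assumptions made for illustration, and the Fuzzy C-Means step is written directly in NumPy using the standard membership and center updates.

```python
import numpy as np

# Hypothetical vocabulary; the radiologist-curated list of 23 regions and
# 137 diagnoses used in the paper is not given in the abstract.
VOCAB = ["lung", "liver", "kidney", "nodule", "cyst", "fracture"]


def bow_vector(report, vocab=VOCAB):
    """Layer 1: count how often each vocabulary term occurs in one report."""
    tokens = report.lower().split()
    return np.array([tokens.count(term) for term in vocab], dtype=float)


def fuzzy_c_means(X, c=2, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Layer 2: standard Fuzzy C-Means; returns (centers, memberships)."""
    rng = np.random.default_rng(seed)
    # Random initial memberships, each row normalized to sum to 1.
    U = rng.random((X.shape[0], c))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        W = U ** m
        # Cluster centers are membership-weighted means of the points.
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        # Euclidean distance from every point to every center.
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        dist = np.fmax(dist, 1e-12)  # avoid division by zero
        # Membership update: u_ij proportional to d_ij^(-2 / (m - 1)).
        inv = dist ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U


# Made-up example reports (not from the paper's data set).
reports = [
    "nodule in right lung",
    "lung nodule stable",
    "simple cyst in left kidney",
    "kidney cyst unchanged",
]
X = np.vstack([bow_vector(r) for r in reports])
centers, U = fuzzy_c_means(X, c=2)
labels = U.argmax(axis=1)  # hard assignment taken from the fuzzy memberships
```

Each row of `U` holds one report's degree of membership in each cluster, so reports with similar region and diagnosis terms end up with similar rows. Restricting the features to a fixed vocabulary mirrors the abstract's radiologist-defined term list, though the real vocabulary would be far larger than this toy one.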