Development of Structured Data from Unstructured Homeopathic Case Sheet Documents Using Natural Language Processing Techniques

Authors

  • Savitha Shetty, Sarika Hegde, Saritha Shetty

Keywords

homeopathic case sheets, information extraction, key value pairs, structured data

Abstract

Homeopathic physicians generally maintain patient records as text documents, which contain valuable clinical information but lack a standardized structure. Generating structured data from unstructured homeopathic case sheet documents is a challenging process. The goal of this study is to identify and extract key-value pairs from such unstructured documents to generate structured data that can be used for analysis and decision making in homeopathic practice. The paper outlines the methodology for collecting homeopathic case sheet documents, extracting relevant information, and structuring it into a format suitable for further analysis. The methodology was implemented on a dataset of 300 homeopathic case sheet documents. From these documents, our approach extracted key-value pairs representing clinical parameters such as details of presenting complaints, family history, and final diagnosis. The key-value pairs were organized in an Excel spreadsheet, facilitating further analysis and interpretation of the extracted clinical data. Evaluation shows that 3 of the 300 documents could not be properly converted to the structured form because of spacing constraints, giving an accuracy of 99% (297 of 300 documents).
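
The abstract does not give implementation details, so the following is only a minimal sketch of what label-based key-value extraction could look like, not the authors' method. The field labels, the "Label: value" line layout, the sample text, and the function names are all assumptions made for illustration. CSV output from the Python standard library stands in for the Excel spreadsheet mentioned in the abstract.

```python
import csv
import io
import re

# Hypothetical field labels. The abstract names these three clinical
# parameters as examples; the exact label spellings are assumed here.
FIELDS = ["Presenting Complaints", "Family History", "Final Diagnosis"]
CANONICAL = {name.lower(): name for name in FIELDS}

# Assumed layout: a known label at the start of a line, then a colon and a value.
LABEL_RE = re.compile(
    r"^\s*(" + "|".join(re.escape(name) for name in FIELDS) + r")\s*:\s*(.*)$",
    re.IGNORECASE,
)


def extract_pairs(text):
    """Return a dict of key-value pairs found in one case sheet."""
    pairs = {}
    current = None
    for line in text.splitlines():
        match = LABEL_RE.match(line)
        if match:
            current = CANONICAL[match.group(1).lower()]
            pairs[current] = match.group(2).strip()
        elif current and line.strip():
            # A non-label line continues the value of the previous key.
            pairs[current] = (pairs[current] + " " + line.strip()).strip()
    return pairs


def rows_to_csv(rows):
    """Serialize one dict per document into CSV text with a fixed header."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, restval="")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def conversion_accuracy(total, failed):
    """Fraction of documents converted successfully."""
    return (total - failed) / total


# Invented sample text, used only to exercise the sketch.
SAMPLE = """Presenting Complaints: Recurrent headache
worse in the morning
Family History: Father diabetic
Final Diagnosis: Migraine
"""

if __name__ == "__main__":
    print(extract_pairs(SAMPLE))
    print(rows_to_csv([extract_pairs(SAMPLE)]))
    print(conversion_accuracy(300, 3))
```

A matcher this strict misses a label whose internal spacing differs from the expected spelling, which is one plausible way spacing irregularities could prevent a document from being converted. The last call reproduces the reported figure: (300 - 3) / 300 = 0.99.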

Published

12.06.2024

How to Cite

Savitha Shetty. (2024). Development of Structured Data from Unstructured Homeopathic Case Sheet Documents Using Natural Language Processing Techniques. International Journal of Intelligent Systems and Applications in Engineering, 12(4), 1698–1702. Retrieved from https://ijisae.org/index.php/IJISAE/article/view/6468

Section

Research Article