Development of Structured Data from Unstructured Homeopathic Case Sheet Documents Using Natural Language Processing Techniques
Keywords:
homeopathic case sheets, information extraction, key value pairs, structured data
Abstract
Homeopathic physicians generally maintain patient records as text documents, which contain valuable clinical information but lack a standardized structure. Generating structured data from these unstructured homeopathic case sheet documents is a challenging process. The goal of this study is to identify and extract key-value pairs from such documents to produce structured data that can be used for analysis and decision making in homeopathic practice. The paper outlines the methodology for collecting homeopathic case sheet documents, extracting the relevant information, and structuring it into a format suitable for further analysis. The methodology was implemented on a dataset of 300 homeopathic case sheet documents. For each document, the approach extracts key-value pairs representing clinical parameters such as details of presenting complaints, family history, and final diagnosis. The key-value pairs were organized in an Excel spreadsheet to facilitate further analysis and interpretation of the extracted clinical data. Evaluation shows that three of the 300 documents could not be properly converted to the structured form because of spacing constraints, giving a document-level conversion accuracy of 99% (297 of 300).
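The abstract does not specify how the key-value pairs are located, so the following is only a minimal sketch of the general idea, assuming that each case sheet lists its fields as "Key: value" lines and that a value may continue on the following lines. The field list `KEYS`, the helper names `extract_pairs` and `to_csv`, and the sample text are hypothetical and are not taken from the paper. The sketch writes CSV using the Python standard library; the study itself reports storing the results in an Excel spreadsheet.

```python
import csv
import io
import re

# Hypothetical field names: the abstract names these three only as examples
# of the clinical parameters that are extracted.
KEYS = ["Presenting Complaints", "Family History", "Final Diagnosis"]

# A known key at the start of a line, followed by a colon and its value.
KEY_PATTERN = re.compile(
    r"^\s*(" + "|".join(re.escape(k) for k in KEYS) + r")\s*:\s*(.*)$",
    re.IGNORECASE,
)


def extract_pairs(text):
    """Return {key: value} for the known keys found in one case sheet."""
    canonical = {k.lower(): k for k in KEYS}
    pairs = {}
    current = None
    for line in text.splitlines():
        match = KEY_PATTERN.match(line)
        if match:
            current = canonical[match.group(1).lower()]
            pairs[current] = match.group(2).strip()
        elif current and line.strip():
            # Continuation line: append it to the value of the most recent key.
            pairs[current] = (pairs[current] + " " + line.strip()).strip()
    return pairs


def to_csv(rows):
    """Serialize one row per document, one column per key (assumed layout)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=KEYS)
    writer.writeheader()
    for row in rows:
        writer.writerow({k: row.get(k, "") for k in KEYS})
    return buffer.getvalue()


# Invented sample text, used only to exercise the sketch.
sample = """Presenting Complaints: Recurrent headache for 2 months
worse in the morning
Family History: Father has hypertension
Final Diagnosis: Migraine
"""

pairs = extract_pairs(sample)
table = to_csv([pairs])
```

A line-oriented matcher of this kind depends on consistent layout, which may be one reason documents with irregular spacing are hard to convert; that reading is an assumption, not a finding stated in the abstract.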
License

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
IJISAE open access articles are licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. This license lets others remix, transform, and build upon the material, provided they give appropriate credit, provide a link to the license, indicate if changes were made, and distribute their contributions under the same license as the original.


