DECISION MAKING IN HEALTHCARE THROUGH MACHINE LEARNING ENHANCEMENT WITH FUSION FEATURE

  • Mampi Devi
  • Dr. Manoj Kumar Sarma
Keywords: Assamese speech, MFCLBS, fusion, cluster, MFCFB

Abstract

Machine learning has become a vital tool for automating decision-making processes. Machine learning algorithms evaluate information, spot trends, and forecast outcomes to support sound decisions; when trained on historical data, they can also assist with disease diagnosis, prognostication, treatment plan personalization, and resource allocation optimization. This work investigates how machine learning can be applied to healthcare decision making.

We present an approach that uses clustering visualization to assess health status from sentence- and word-level analysis of unlabeled speech signals, applying unsupervised machine learning algorithms, K-Means and spectral clustering, which are widely used to uncover hidden structures, correlations, and trends in healthcare data. A fusion feature, created by combining several distinct features, provides an added advantage. To facilitate pattern recognition and interpretation, we visualized the clusters after reducing the feature dimensionality with principal component analysis (PCA), verified the effectiveness of the suggested method on our two primary datasets, and finally applied the same method to an online health dataset for comparison.
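To make the fused-feature pipeline concrete, the following is a minimal Python sketch using scikit-learn. The random arrays stand in for the study's actual MFCC-based feature sets (e.g., MFCLBS and MFCFB); fusion by concatenation, the choice of two clusters, and clustering in the PCA-reduced space are illustrative assumptions rather than the paper's confirmed implementation.

```python
# Minimal sketch of the fusion + PCA + clustering pipeline (illustrative only).
import numpy as np
import matplotlib.pyplot as plt
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans, SpectralClustering

rng = np.random.default_rng(0)
# Stand-ins for two distinct per-utterance feature sets; in practice these
# would be extracted from the speech recordings (e.g., MFCC-based descriptors).
feats_a = rng.normal(size=(200, 13))
feats_b = rng.normal(size=(200, 8))

# Fusion feature: concatenate the distinct feature sets for each utterance,
# then standardize so no single feature set dominates the principal components.
fused = StandardScaler().fit_transform(np.hstack([feats_a, feats_b]))

# PCA reduces the fused features to two components for visualization.
reduced = PCA(n_components=2).fit_transform(fused)

# Unsupervised clustering; here it runs on the reduced features for simplicity,
# though clustering could equally be done in the full fused space.
km_labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(reduced)
sc_labels = SpectralClustering(n_clusters=2, random_state=0).fit_predict(reduced)

# Side-by-side cluster visualizations for pattern recognition and interpretation.
fig, axes = plt.subplots(1, 2, figsize=(10, 4))
for ax, labels, title in zip(axes, (km_labels, sc_labels),
                             ("K-Means", "Spectral clustering")):
    ax.scatter(reduced[:, 0], reduced[:, 1], c=labels, s=12)
    ax.set(title=title, xlabel="PC 1", ylabel="PC 2")
plt.tight_layout()
plt.show()
```

Concatenation is the simplest fusion scheme; any weighting or selection among the combined features would refine it, but the hidden-structure discovery itself comes from the unsupervised clustering step.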

In the observation stage, the clustering visualizations supported the health status results by revealing distinct cluster alterations linked to particular medical disorders. The research has significant potential applications in speech recognition and, in the future, may influence speech therapy, real-time disease detection, and remote health monitoring systems.

Author Biographies

Mampi Devi

Research Scholar, Department of Computer Science & Engineering, ADTU, India

Dr. Manoj Kumar Sarma

Associate Professor, Department of Computer Science & Engineering, ADTU, India

How to Cite
Devi Mampi, & Sarma Manoj Kumar. (1). DECISION MAKING IN HEALTHCARE THROUGH MACHINE LEARNING ENHANCEMENT WITH FUSION FEATURE. Revista Electronica De Veterinaria, 25(1), 563-574. Retrieved from https://veterinaria.org/index.php/REDVET/article/view/575
Section
Articles