Code-switching Detection - Approaches and Evaluation: Investigating approaches and evaluation methods for code-switching detection in multilingual text data to identify language switches within sentences

Authors

  • Hyejin Kim, Lecturer, Health Informatics Division, Mount Fuji Institute of Technology, Osaka, Japan

Keywords:

Code-switching detection, multilingual text data, natural language processing

Abstract

Code-switching, the alternation between two or more languages within a single discourse, is a prevalent linguistic phenomenon in multilingual communities. Detecting code-switching in text is essential for natural language processing (NLP) tasks such as machine translation, sentiment analysis, and information retrieval, because these tasks depend on knowing which language each segment of the input is written in. This paper provides an overview of approaches and evaluation methods for code-switching detection in multilingual text data. We examine the main challenges of the task: the scarcity of annotated datasets, the complexity of language mixing patterns, and the need for context-aware detection algorithms.

The paper discusses three families of approaches to code-switching detection: rule-based methods, statistical models, and deep learning techniques. Rule-based methods rely on linguistic rules and patterns to identify language switches, while statistical models estimate the probability of a switch from lexical and syntactic features. Deep learning techniques, such as recurrent neural networks (RNNs) and transformer models, have shown promising results by leveraging the context surrounding each token.
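To make the rule-based family concrete, the following is a minimal sketch of lexicon lookup with a simple context fallback. It is an illustration only and not a method taken from the paper: the toy English and Spanish word lists, the function names tag_tokens and switch_points, and the rule that ambiguous tokens inherit the previous token's label are all assumptions made for this example.

```python
import re

# Hypothetical toy lexicons (assumption); a real system would use large
# word lists or a trained language identifier instead.
LEXICONS = {
    "en": {"i", "want", "to", "go", "the", "with", "my", "friends", "but", "am", "tired"},
    "es": {"quiero", "ir", "a", "la", "playa", "con", "mis", "amigos", "pero", "estoy", "cansado"},
}


def tag_tokens(sentence):
    """Label each token with a language code by lexicon lookup.

    Tokens found in no lexicon, or in more than one, inherit the label of
    the previous token. This is a crude stand-in for context-aware
    disambiguation. A sentence-initial unknown token is labeled "unk".
    """
    tokens = re.findall(r"\w+", sentence.lower())
    tags = []
    for tok in tokens:
        matches = [lang for lang, words in LEXICONS.items() if tok in words]
        if len(matches) == 1:
            tags.append(matches[0])
        else:
            tags.append(tags[-1] if tags else "unk")
    return list(zip(tokens, tags))


def switch_points(tagged):
    """Return each index i where token i differs in language from token i-1.

    Pairs involving an "unk" label are skipped, since no switch can be
    asserted when one side is unidentified.
    """
    points = []
    for i in range(1, len(tagged)):
        prev_lang, cur_lang = tagged[i - 1][1], tagged[i][1]
        if "unk" in (prev_lang, cur_lang):
            continue
        if prev_lang != cur_lang:
            points.append(i)
    return points


tagged = tag_tokens("I want to go a la playa with my friends")
print(tagged)
print(switch_points(tagged))  # [4, 7]
```

For the example sentence, switches are reported at token indices 4 ("a") and 7 ("with"). A lookup of this kind fails on words shared between languages and on out-of-vocabulary tokens, which is one reason statistical and neural models that use wider context are attractive.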

Furthermore, we explore evaluation metrics for code-switching detection, including accuracy, precision, recall, and F1 score. We discuss the importance of annotated datasets for evaluating code-switching detection systems and the challenges of cross-lingual evaluation. We also review existing annotated datasets and evaluation benchmarks to facilitate future research in this area.
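As an illustration of how these metrics might be computed, the sketch below scores a system on two levels: token-level language-tag accuracy, and precision, recall, and F1 over predicted switch positions. The function names, the exact-position matching criterion for switch points, and the example labels are assumptions for this sketch; the paper does not prescribe a specific scoring protocol.

```python
def token_accuracy(gold_tags, pred_tags):
    """Fraction of tokens whose predicted language tag matches the gold tag."""
    if len(gold_tags) != len(pred_tags):
        raise ValueError("gold and predicted sequences must have equal length")
    if not gold_tags:
        return 0.0
    correct = sum(g == p for g, p in zip(gold_tags, pred_tags))
    return correct / len(gold_tags)


def switch_prf(gold_switches, pred_switches):
    """Precision, recall, and F1 over switch-point indices.

    A predicted switch counts as a true positive only if its index exactly
    matches a gold switch index (an assumption; looser matching is possible).
    """
    gold, pred = set(gold_switches), set(pred_switches)
    tp = len(gold & pred)
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical gold and predicted labels for a ten-token sentence.
gold_tags = ["en"] * 4 + ["es"] * 3 + ["en"] * 3
pred_tags = ["en"] * 5 + ["es"] * 2 + ["en"] * 3

print(token_accuracy(gold_tags, pred_tags))  # 0.9
print(switch_prf([4, 7], [5, 7]))            # (0.5, 0.5, 0.5)
```

The example shows why both levels are worth reporting: a single mislabeled token costs only 10% token accuracy but moves a switch point by one position, halving switch-level precision and recall under exact matching.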




Published

2021-05-30

How to Cite

[1]
Hyejin Kim, “Code-switching Detection - Approaches and Evaluation: Investigating approaches and evaluation methods for code-switching detection in multilingual text data to identify language switches within sentences”, J. of Artificial Int. Research and App., vol. 1, no. 1, pp. 1–9, May 2021, Accessed: Jun. 28, 2024. [Online]. Available: https://aimlstudies.co.uk/index.php/jaira/article/view/43
