ArbDialectID at MADAR Shared Task 1: Language Modelling and Ensemble Learning for Fine Grained Arabic Dialect Identification

Study details

Science and Technology

Abstract:

In this paper, we present a dialect identification system (ArbDialectID) that competed in Task 1 of the MADAR shared task, MADAR Travel Domain Dialect Identification. We build a coarse and a fine-grained identification model to predict the label (corresponding to a dialect of Arabic) of a given text. We build two language models by extracting features at two levels (words and characters). We first build a coarse identification model to classify each sentence into one of six dialects, then use this label as a feature for the fine-grained model, which classifies the sentence among 26 dialects from different Arab cities. Finally, we apply an ensemble voting classifier to both sub-systems. Our system ranked 1st, achieving an F-score of 67.32%. Both the models and our feature engineering tools are made available to the research community.
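
The pipeline outlined in the abstract could be sketched roughly as follows. This is not the authors' implementation: the naive Bayes classifier, the character trigram order, the soft-voting rule, the choice to feed the fine model a predicted (rather than gold) coarse label, and the toy sentences with placeholder labels (`R1`, `R2`, `A` to `D`) are all assumptions made for illustration only.

```python
import math
from collections import Counter, defaultdict


def word_features(text):
    # Word-level unigrams.
    return text.split()


def char_features(text, n=3):
    # Character n-grams over the space-padded sentence (n=3 is an assumption).
    padded = f" {text} "
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


class NaiveBayes:
    """Multinomial naive Bayes with add-one smoothing (a stand-in classifier)."""

    def __init__(self, extract):
        self.extract = extract
        self.feature_counts = defaultdict(Counter)
        self.label_counts = Counter()
        self.vocab = set()

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            feats = self.extract(text)
            self.feature_counts[label].update(feats)
            self.label_counts[label] += 1
            self.vocab.update(feats)
        return self

    def predict_proba(self, text):
        feats = self.extract(text)
        total_docs = sum(self.label_counts.values())
        scores = {}
        for label, n_docs in self.label_counts.items():
            counts = self.feature_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            scores[label] = math.log(n_docs / total_docs) + sum(
                math.log((counts[f] + 1) / denom) for f in feats
            )
        # Normalise log scores into probabilities (log-sum-exp style).
        top = max(scores.values())
        exp = {label: math.exp(s - top) for label, s in scores.items()}
        norm = sum(exp.values())
        return {label: v / norm for label, v in exp.items()}

    def predict(self, text):
        proba = self.predict_proba(text)
        return max(proba, key=proba.get)


def train_two_stage(texts, coarse_labels, fine_labels, extract):
    # Stage 1: coarse model over the base features.
    coarse = NaiveBayes(extract).fit(texts, coarse_labels)

    # Stage 2: fine model sees the base features plus the predicted coarse label.
    def extract_with_coarse(text):
        return extract(text) + ["__coarse__=" + coarse.predict(text)]

    return NaiveBayes(extract_with_coarse).fit(texts, fine_labels)


def soft_vote(models, text):
    # Sum the per-label probabilities of all sub-systems and pick the best label.
    totals = Counter()
    for model in models:
        totals.update(model.predict_proba(text))
    return max(totals, key=totals.get)


# Invented toy data: placeholder tokens, regions (coarse) and cities (fine).
texts = ["aa bb cc", "aa bb dd", "xx yy zz", "xx yy ww"]
coarse_labels = ["R1", "R1", "R2", "R2"]
fine_labels = ["A", "B", "C", "D"]

models = [
    train_two_stage(texts, coarse_labels, fine_labels, word_features),
    train_two_stage(texts, coarse_labels, fine_labels, char_features),
]
print(soft_vote(models, "xx yy zz"))
```

In the shared task setting, the coarse stage would have six labels and the fine stage 26; the sketch only shows how a coarse prediction can be passed forward as one extra feature and how word-level and character-level sub-systems can be combined by voting.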

Study properties

Author

Abu Kwaik, Kathrein

Saad, Motaz K
Publication year

2019-08
Publisher:

Association for Computational Linguistics
Source:

Islamic University of Gaza Digital Repository
Content type:

Conference Paper
Language:

English
Peer-reviewed:

Yes
Country:

Palestine
Text:

Full study
File type:

pdf

Access information

Study link
https://iugspace.iugaza.edu.ps/bitstream/handle/20.500.12358/27253/W19-4632.pdf?sequence=2&isAllowed=y
