Abusive words Detection in Persian tweets using machine learning and deep learning techniques
| Published in | 2021 7th International Conference on Signal Processing and Intelligent Systems (ICSPIS) pp. 1 - 5 |
|---|---|
| Main Authors | Dehghani, Mohammad; Dehkordy, Diyana Tehrany; Bahrani, Mohammad |
| Format | Conference Proceeding |
| Language | English |
| Published | IEEE, 29.12.2021 |
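| Subjects | abusive comments; BERT; Blogs; Deep learning; machine learning; Neural networks; Persian tweets; Social networking (online); Solid modeling; Transfer learning |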
| DOI | 10.1109/ICSPIS54653.2021.9729390 |
| Abstract | With the growth of the web and of user interaction, users now voice opinions on a wide range of topics online. In recent years, detecting abusive language in user-generated content has become a necessity. Twitter is a platform on which users share text messages. People express opinions there on many topics and in many styles of language, some of which include abusive words. On the one hand, abusive comments can be derogatory and harmful to those who share content. On the other hand, filtering such comments in languages other than English is difficult and time-consuming. Most social media platforms are still looking for more efficient ways to filter comments because manual moderation is expensive, slow, and risky. Automation helps to identify and filter abusive comments more reliably and to increase user safety. This article presents a deep learning method for detecting abusive words in Persian tweets. Because suitable Persian data was lacking, we created a dataset of 33,338 Persian tweets, of which 10% contained abusive words and 90% did not. Perhaps the simplest approach is to filter comments against a fixed word list, so a list of 648 Persian abusive words was prepared and tested on the dataset (accuracy of 76%). Finally, a deep neural network based on the BERT language model was implemented to detect abusive words; it performed best, with an accuracy of 97.7%. |
|---|---|
| Author | Bahrani, Mohammad; Dehkordy, Diyana Tehrany; Dehghani, Mohammad |
| Author_xml | 1. Dehghani, Mohammad (email: mohamad.dehqani@modares.ac.ir; organization: Tarbiat Modares University, Department of Industrial and Systems Engineering, Tehran, Iran); 2. Dehkordy, Diyana Tehrany (email: d.tehrany@mail.um.ac.ir; organization: Ferdowsi University of Mashhad, Department of Computer Engineering, Mashhad, Iran); 3. Bahrani, Mohammad (email: bahrani@atu.ac.ir; organization: Allameh Tabataba'i University, Faculty of Statistics, Mathematics and Computer, Tehran, Iran) |
| ContentType | Conference Proceeding |
| DOI | 10.1109/ICSPIS54653.2021.9729390 |
| EISBN | 9781665409384; 166540938X |
| EndPage | 5 |
| ExternalDocumentID | 9729390 |
| Genre | orig-research |
| IsPeerReviewed | false |
| IsScholarly | false |
| Language | English |
| PageCount | 5 |
| PublicationCentury | 2000 |
| PublicationDate | 2021-Dec.-29 |
| PublicationDateYYYYMMDD | 2021-12-29 |
| PublicationDate_xml | year: 2021; month: 12; day: 29; text: 2021-Dec.-29 |
| PublicationDecade | 2020 |
| PublicationTitle | 2021 7th International Conference on Signal Processing and Intelligent Systems (ICSPIS) |
| PublicationTitleAbbrev | ICSPIS |
| PublicationYear | 2021 |
| Publisher | IEEE |
| Publisher_xml | – name: IEEE |
| StartPage | 1 |
| SubjectTerms | abusive comments; BERT; Blogs; Deep learning; machine learning; Neural networks; Persian tweets; Social networking (online); Solid modeling; Transfer learning |
| Title | Abusive words Detection in Persian tweets using machine learning and deep learning techniques |
| URI | https://ieeexplore.ieee.org/document/9729390 |
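
The fixed-list baseline mentioned in the abstract can be sketched as below. This is a minimal illustration and not the authors' implementation: the lexicon entries, the ten example texts, and the function names (`tokenize`, `lexicon_predict`, `accuracy`) are all hypothetical placeholders, and the tokenizer is deliberately naive, whereas a real Persian pipeline would need proper normalization. The sketch also compares the filter with a classifier that always predicts "non-abusive". On a 10% / 90% split like the one the abstract reports, that trivial classifier already reaches 90% accuracy, so accuracy figures on such data are best read alongside the class balance.

```python
import re


def tokenize(text):
    """Naive tokenizer: lowercase, then split on runs of non-word characters."""
    return [tok for tok in re.split(r"\W+", text.lower()) if tok]


def lexicon_predict(text, lexicon):
    """Return 1 (abusive) if any token appears in the fixed word list, else 0."""
    return int(any(tok in lexicon for tok in tokenize(text)))


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the gold labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical placeholder lexicon (the paper's list has 648 Persian entries).
LEXICON = {"badword", "slur"}

# Hypothetical toy data with the same 10% / 90% class balance as the abstract.
DATA = [
    ("you are a badword", 1),                 # abusive, caught by the list
    ("the word badword is banned here", 0),   # false positive: word is only mentioned
    ("he quoted the slur in his report", 0),  # false positive: word is only quoted
    ("nice weather today", 0),
    ("great match last night", 0),
    ("reading a new book", 0),
    ("traffic was heavy this morning", 0),
    ("happy birthday to my friend", 0),
    ("the exam went well", 0),
    ("coffee with colleagues", 0),
]

texts = [text for text, _ in DATA]
labels = [label for _, label in DATA]

lexicon_preds = [lexicon_predict(text, LEXICON) for text in texts]
majority_preds = [0] * len(labels)  # always predict "non-abusive"

lexicon_acc = accuracy(labels, lexicon_preds)
majority_acc = accuracy(labels, majority_preds)

print(f"lexicon filter accuracy:  {lexicon_acc:.2f}")
print(f"majority-class accuracy:  {majority_acc:.2f}")
```

The two false positives in the toy data illustrate one way a context-free word list can go wrong: it cannot tell a word being used from a word being mentioned or quoted, which is the kind of distinction a contextual language model is better placed to capture.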