HASOC (2024)

Hate Speech and Offensive Content Identification in English and Bangla

Datasets

  • For Task 1, participants may use any external resources and datasets. The test dataset will be released later for model evaluation.
  • The Task 2 dataset is available at https://github.com/LanguageTechnologyLab/TB-OLID
  • HASOC 2024 Datasets

    Category | Train Dataset | Test Dataset
    Task 1 English | NA | Download
    Task 2 Bangla | Download | Download

    HASOC 2022 Dataset

    Task | Training Data | Test Data
    Identification of Conversational Hate-Speech in Code-Mixed Languages (ICHCL) (Task-1 and Task 2) | Download | Download
    Offensive Language Identification in Marathi (Task-3A, 3B, 3C) | Download | Download

    HASOC 2021 Dataset

    Subtask 1 Dataset

    Category | Train Dataset | Test Dataset
    English Dataset | Download | Download
    Hindi Dataset | Download | Download
    Marathi Dataset | Download | Download

    Subtask 2 Dataset

    Category | Train Dataset | Test Dataset
    English-Hindi Code-Mix Dataset | Download | Download

    To know more about the subtasks, click here.

    HASOC 2020 Dataset

    Category | Link
    English Dataset | Download
    Hindi Dataset | Download
    German Dataset | Download

    To know more, click here.

    HASOC 2019 Dataset

    Category | Link
    English Dataset | Download
    Hindi Dataset | Download
    German Dataset | Download

    To know more, click here.

    Contact us

    Subscribe to our mailing list for the latest announcements and discussions.

    For any queries, write to us at hasoc@googlegroups.com.