PriyankaHundalekar/Hindi-Offensive-Analyzer-MuRIL

Hindi-Offensive-Analyzer-MuRIL

Model Description

Overview

Hindi-Offensive-Analyzer-MuRIL is a fine-tuned language model based on MuRIL (Multilingual Representations for Indian Languages), a BERT-based model pre-trained on 17 Indian languages and their transliterated counterparts. It has been fine-tuned to classify Hindi comments as hate or non-hate.

MuRIL Base Cased

The MuRIL base cased model (google/muril-base-cased) serves as the foundation for Hindi-Offensive-Analyzer-MuRIL. MuRIL is a language model pre-trained on a large dataset of text in various Indian languages. Its training paradigm is similar to that of multilingual BERT, with modifications intended to improve performance on low-resource languages.

Application: Hindi Hate Speech Comment Classification

Hindi-Offensive-Analyzer-MuRIL has been fine-tuned to classify comments written in Hindi as either "Hate" or "Non-Hate". It is intended for applications that require hate speech detection and moderation on platforms and websites that host content in Hindi. The model uses the following labels:

Label 0: Non-Hate

Label 1: Hate
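
The label mapping above can be sketched in Python as follows. This is a minimal sketch, not the author's reference code: it assumes the checkpoint loads with the Hugging Face `transformers` `AutoTokenizer` and `AutoModelForSequenceClassification` classes, that PyTorch is installed, and that the two output logits follow the label order listed above. The helper names `softmax`, `decode`, and `classify` are hypothetical.

```python
import math

# Repository ID as shown at the top of this model card.
MODEL_ID = "PriyankaHundalekar/Hindi-Offensive-Analyzer-MuRIL"

# Label mapping taken from the model card.
ID2LABEL = {0: "Non-Hate", 1: "Hate"}


def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode(logits):
    """Return (label, probability) for the highest-scoring class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return ID2LABEL[best], probs[best]


def classify(texts):
    """Classify a list of Hindi comments (downloads the model on first use).

    Assumption: the checkpoint has a two-class sequence-classification head.
    """
    # Imported lazily so the helpers above work without these packages.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSequenceClassification.from_pretrained(MODEL_ID)
    model.eval()

    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(**batch).logits
    return [decode(row.tolist()) for row in logits]
```

Calling `classify(["..."])` with a list of Hindi comments would return one `(label, probability)` pair per comment. If the checkpoint's own configuration defines different label names, that configuration should take precedence over the hard-coded `ID2LABEL` mapping here.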

Hardware Requirements:

Processor (minimum): Intel Core i3 or AMD Ryzen 3
Processor (local machine): Intel® Core™ i5-6200U CPU @ 2.30GHz × 4
RAM: 12 GB
GPU: NVIDIA Tesla T4 (16 GB)

Software Requirements:

Operating System: Windows 10
Programming Language: Python 3
Development Environment: Google Colab Pro Notebook

Use Cases

Hindi-Offensive-Analyzer-MuRIL can be used in a variety of applications, including content moderation, social media monitoring, and sentiment analysis. It helps promote a safer online environment by automatically identifying and flagging potentially harmful or offensive content.
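
As an illustration of the flagging step in a moderation workflow, the sketch below marks a comment for review when its "Hate" probability reaches a threshold. It is not part of the released model: the `ModerationResult` class, the `flag_comments` function, the default threshold of 0.5, and the sample probabilities are all hypothetical, and the probabilities are assumed to come from the classifier's output for Label 1.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass
class ModerationResult:
    text: str
    hate_probability: float
    flagged: bool


def flag_comments(
    scored: Iterable[Tuple[str, float]], threshold: float = 0.5
) -> List[ModerationResult]:
    """Flag each (text, hate_probability) pair at or above the threshold."""
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0 and 1")
    return [
        ModerationResult(text, prob, prob >= threshold)
        for text, prob in scored
    ]


if __name__ == "__main__":
    # Placeholder comments with made-up probabilities, for illustration only.
    scored = [("comment A", 0.08), ("comment B", 0.93)]
    for result in flag_comments(scored, threshold=0.8):
        status = "FLAG" if result.flagged else "ok"
        print(f"{status}\t{result.hate_probability:.2f}\t{result.text}")
```

A higher threshold flags fewer comments and reduces false positives at the cost of missing some hateful content. The right value depends on the platform and is best chosen on held-out labelled data, with flagged items routed to human review.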

Acknowledgments

This model builds upon the foundation of the MuRIL language model, which is the result of collaborative research and contributions from the NLP community. We extend our appreciation to the creators of MuRIL for their work in advancing the understanding and processing of Indian languages.

Developed by: Priyanka Hundalekar
Model type: Text Classification
Language(s) (NLP): Hindi
Finetuned from model: google/muril-base-cased