# Model Card for Fake News Detection Model

## Model Summary
This is a fine-tuned DistilBERT model for fake news detection. It classifies news articles as either real or fake based on their textual content. The model was trained on a labeled dataset of real and fake news articles collected from various sources.
## Model Details

### Model Description

- Developed by: Dhruv Pal
- Finetuned from: `distilbert-base-uncased`
- Language: English
- Model type: Transformer-based text classification model
- License: MIT
- Intended Use: Fake news detection on social media and news websites
### Model Sources

- Repository: Hugging Face Model Hub
- Paper: N/A
- Demo: N/A
## Uses

### Direct Use
- This model can be used to detect whether a given news article is real or fake.
- It can be integrated into fact-checking platforms, misinformation detection systems, and social media moderation tools.
### Downstream Use
- Can be further fine-tuned on domain-specific fake news datasets.
- Useful for media companies, journalists, and researchers studying misinformation.
### Out-of-Scope Use
- This model is not designed for generating news content.
- It may not work well for languages other than English.
- Not suitable for fact-checking complex claims requiring external knowledge.
## Bias, Risks, and Limitations

### Risks
- The model may be biased toward the topics, sources, and writing styles represented in its training data.
- Both false positives (real news misclassified as fake) and false negatives (fake news misclassified as real) are possible.
- Model performance can degrade on out-of-distribution samples.
### Recommendations
- Users should not rely solely on this model to determine truthfulness.
- Pair its predictions with human verification and cross-check information against multiple sources.
## How to Use the Model

You can load the model with the `transformers` library and use it for inference as shown below:
```python
from transformers import DistilBertTokenizerFast, DistilBertForSequenceClassification
import torch

tokenizer = DistilBertTokenizerFast.from_pretrained("your-model-id")
model = DistilBertForSequenceClassification.from_pretrained("your-model-id")
model.eval()  # disable dropout for inference

def predict(text):
    # Tokenize and truncate to DistilBERT's 512-token limit
    inputs = tokenizer(text, return_tensors="pt", truncation=True,
                       padding=True, max_length=512)
    with torch.no_grad():  # no gradients needed at inference time
        outputs = model(**inputs)
    probs = torch.nn.functional.softmax(outputs.logits, dim=-1)
    # Label 1 = fake, label 0 = real
    return "Fake News" if torch.argmax(probs) == 1 else "Real News"

text = "Breaking: Scientists discover a new element!"
print(predict(text))
```
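In line with the Recommendations above, it can help to surface the model's confidence instead of a bare label, so that borderline articles are routed to a human reviewer. A minimal sketch reusing the `tokenizer` and `model` loaded above; `predict_with_confidence` and the 0.75 threshold are illustrative choices, not part of the released model:

```python
def predict_with_confidence(text, threshold=0.75):
    # threshold is an illustrative value, not tuned for this model
    inputs = tokenizer(text, return_tensors="pt", truncation=True,
                       padding=True, max_length=512)
    with torch.no_grad():
        outputs = model(**inputs)
    probs = torch.nn.functional.softmax(outputs.logits, dim=-1)[0]
    label = "Fake News" if probs[1] > probs[0] else "Real News"
    confidence = probs.max().item()
    if confidence < threshold:
        # Low-confidence predictions should go to human review
        return f"{label} (confidence {confidence:.2f}, flag for human review)"
    return f"{label} (confidence {confidence:.2f})"
```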
## Training Details

### Training Data
The model was trained on a dataset of news articles labeled as real or fake, drawn from reputable news outlets and known misinformation websites.
### Training Procedure
**Preprocessing** (sketched below):

- Tokenization using `DistilBertTokenizerFast`
- Removal of stop words and punctuation
- Conversion of text to lowercase
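A minimal sketch of this preprocessing pipeline, assuming NLTK's English stop-word list (NLTK is not listed under Dependencies, so treat it as an illustrative choice):

```python
import re
import nltk
from nltk.corpus import stopwords
from transformers import DistilBertTokenizerFast

nltk.download("stopwords")  # one-time download of the stop-word list
stop_words = set(stopwords.words("english"))
tokenizer = DistilBertTokenizerFast.from_pretrained("distilbert-base-uncased")

def preprocess(text):
    text = text.lower()                      # convert to lowercase
    text = re.sub(r"[^\w\s]", "", text)      # remove punctuation
    kept = [w for w in text.split() if w not in stop_words]  # remove stop words
    return tokenizer(" ".join(kept), truncation=True, max_length=512)

print(preprocess("Breaking: Scientists discover a new element!"))
```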
**Training Configuration** (see the Trainer sketch below):

- Model: `distilbert-base-uncased`
- Optimizer: AdamW
- Batch size: 16
- Epochs: 3
- Learning rate: 2e-5
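As a reference point, here is a minimal sketch of this configuration using the `transformers` Trainer API (whose default optimizer is AdamW). The two-example dataset is a hypothetical stand-in for the actual training data, which is not bundled with this card:

```python
from datasets import Dataset
from transformers import (DistilBertTokenizerFast,
                          DistilBertForSequenceClassification,
                          Trainer, TrainingArguments)

# Hypothetical toy data standing in for the real labeled corpus
data = Dataset.from_dict({
    "text": ["Headline from a reputable outlet.", "Fake miracle cure headline!"],
    "label": [0, 1],  # 0 = real, 1 = fake
})

tokenizer = DistilBertTokenizerFast.from_pretrained("distilbert-base-uncased")
data = data.map(
    lambda batch: tokenizer(batch["text"], truncation=True,
                            padding="max_length", max_length=512),
    batched=True,
)

model = DistilBertForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=2)

# Hyperparameters mirror the Training Configuration above
args = TrainingArguments(
    output_dir="./results",
    num_train_epochs=3,
    per_device_train_batch_size=16,
    learning_rate=2e-5,
)

Trainer(model=model, args=args, train_dataset=data).train()
```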
### Compute Resources
- Hardware: NVIDIA Tesla T4 (Google Colab)
- Training Time: ~2 hours
## Evaluation

### Testing Data
- The model was evaluated on a held-out test set of 10,000 news articles.
### Metrics

The model was evaluated using accuracy, precision, recall, and F1 score; the scores are tabulated under Results below.
### Results

| Metric    | Score |
|-----------|-------|
| Accuracy  | 92%   |
| F1 Score  | 90%   |
| Precision | 91%   |
| Recall    | 89%   |
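These scores can be reproduced from raw predictions with `scikit-learn` (listed under Dependencies). A minimal sketch with hypothetical label arrays; on the real test set, `y_true` and `y_pred` would come from the held-out articles and the model's outputs:

```python
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

# Hypothetical labels: 1 = fake, 0 = real
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1]

print("Accuracy: ", accuracy_score(y_true, y_pred))
print("Precision:", precision_score(y_true, y_pred))
print("Recall:   ", recall_score(y_true, y_pred))
print("F1 Score: ", f1_score(y_true, y_pred))
```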
## Environmental Impact
- Hardware Used: NVIDIA Tesla T4
- Total Compute Time: ~2 hours
- Carbon Emissions: Estimated using the ML Impact Calculator
## Technical Specifications

### Model Architecture

- The model is based on DistilBERT, a distilled version of BERT that is roughly 40% smaller and 60% faster while retaining most of BERT's accuracy.
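The key architecture hyperparameters can be read off the pretrained configuration; the commented values are those of `distilbert-base-uncased`:

```python
from transformers import DistilBertConfig

config = DistilBertConfig.from_pretrained("distilbert-base-uncased")
print(config.n_layers)  # 6 transformer layers (half of BERT-base's 12)
print(config.dim)       # 768 hidden size
print(config.n_heads)   # 12 attention heads
```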
### Dependencies

- `transformers`
- `torch`
- `datasets`
- `scikit-learn`
## Citation

If you use this model, please cite it as:

```bibtex
@misc{DhruvPal2025FakeNewsDetection,
  title={Fake News Detection with DistilBERT},
  author={Dhruv Pal},
  year={2025},
  howpublished={\url{https://huggingface.co/your-model-id}}
}
```
## Contact

For any queries, feel free to reach out:
- Author: Dhruv Pal
- Email: dhruv416pal@gmail.com
- GitHub: dhruvpal05
- LinkedIn: idhruvpal