---
library_name: transformers
license: mit
datasets:
- JeanKaddour/minipile
language:
- en
---

# BEE-spoke-data/MiniTokenizer-20480

This is a `ByteLevelBPETokenizer` trained on the `JeanKaddour/minipile` dataset, with the aim of creating a compact English-only tokenizer.

## Usage

Load it with `AutoTokenizer`:

```py
from transformers import AutoTokenizer

tk = AutoTokenizer.from_pretrained('BEE-spoke-data/MiniTokenizer-20480')
tk  # in a notebook or REPL, this displays the tokenizer's configuration
```
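
Once loaded, the tokenizer can be used to encode and decode text. The following is a minimal sketch, not part of the original card: the helper names `load_tokenizer` and `roundtrip` are hypothetical, and it assumes the loaded object exposes the standard `encode` and `decode` methods of a `transformers` tokenizer. Whether decoding reproduces the input exactly depends on the tokenizer's settings, so no output is shown.

```py
from typing import Any


def load_tokenizer(repo_id: str = "BEE-spoke-data/MiniTokenizer-20480") -> Any:
    """Download the tokenizer from the Hugging Face Hub (requires network access)."""
    # Imported lazily so that roundtrip() below has no hard dependency on transformers.
    from transformers import AutoTokenizer

    return AutoTokenizer.from_pretrained(repo_id)


def roundtrip(tk: Any, text: str) -> tuple[list[int], str]:
    """Encode text to token ids, then decode those ids back to a string.

    `tk` is assumed to be any object with `encode` and `decode` methods,
    such as the tokenizer returned by load_tokenizer().
    """
    # add_special_tokens=False keeps only the tokens produced from `text` itself.
    ids = tk.encode(text, add_special_tokens=False)
    return ids, tk.decode(ids)
```

For example, `ids, decoded = roundtrip(load_tokenizer(), "Hello, world!")` gives the token ids for the sentence and the string decoded from them; `len(ids)` is a quick way to compare how compactly different tokenizers encode the same text.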