---
license: llama2
---

# Introducing Code Millenials 13B

Welcome to our Code Model repository! Our model is fine-tuned specifically for code generation tasks, aiming to revolutionize how systems understand and translate natural language instructions into code. Built on CodeLLaMa 13B, it has been meticulously fine-tuned on a curated set of code generation instructions to ensure quality and precision. With the implementation of lambda attention, the model supports sequence lengths of 120K+ tokens without affecting perplexity.

## Generate responses

Inference code using the pre-trained model from the Hugging Face model hub:

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("budecosystem/code-millenials-13b")
model = AutoModelForCausalLM.from_pretrained("budecosystem/code-millenials-13b")

prompt = "A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions. USER: Create SQL query for the given table schema and question ASSISTANT:"

inputs = tokenizer(prompt, return_tensors="pt")
sample = model.generate(**inputs, max_length=128)
print(tokenizer.decode(sample[0]))
```

To use the extended context length, run the `generate.py` file from the [GitHub repo](https://github.com/BudEcosystem/code-millenials):

```
python generate.py --base_model budecosystem/code-millenials-13b
```

You can integrate the model into your own code by loading it with the `convert_llama_model` function:

```python
import torch
from transformers import GenerationConfig, AutoModelForCausalLM, AutoTokenizer
from model.llama import convert_llama_model

# Lambda attention settings for the extended-context conversion
local_branch = 2048
global_branch = 10
limit_distance = 2048

model = AutoModelForCausalLM.from_pretrained(
    "budecosystem/code-millenials-13b",
    torch_dtype=torch.float16,
    device_map="auto",
)
model = convert_llama_model(model, local_branch, global_branch)
```

## Training details

The model was trained on 8x A100 80GB GPUs for approximately 15 hours.

| Hyperparameter              | Value        |
| :-------------------------- | :----------: |
| per_device_train_batch_size | 2            |
| gradient_accumulation_steps | 1            |
| epochs                      | 3            |
| steps                       | 19206        |
| learning_rate               | 2e-5         |
| lr scheduler type           | cosine       |
| warmup ratio                | 0.1          |
| optimizer                   | adamw        |
| fp16                        | True         |
| GPU                         | 8x A100 80GB |
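
For reference, the hyperparameters above roughly correspond to a Hugging Face `TrainingArguments` configuration like the sketch below. This is an illustrative reconstruction rather than the actual training script; the output directory and the choice of `adamw_torch` as the AdamW variant are assumptions.

```python
from transformers import TrainingArguments

# Illustrative sketch of the training configuration listed in the table above.
# The output_dir and optimizer variant are assumptions, not taken from the training script.
training_args = TrainingArguments(
    output_dir="./code-millenials-13b",   # assumed output path
    per_device_train_batch_size=2,
    gradient_accumulation_steps=1,
    num_train_epochs=3,
    learning_rate=2e-5,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    optim="adamw_torch",                  # assumed AdamW implementation
    fp16=True,
)
```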