Saving 77% of the Parameters in Large Language Models Technical Report

This repository contains experiment results for the Saving 77% of the Parameters in Large Language Models Technical Report (PDF).

Abstract

This technical report demonstrates that large language models (LLMs) can maintain their learning capacity while reducing their non-embedding parameters by up to 77%. We achieve this by adapting a parameter reduction technique originally developed for computer vision, replacing dense layers with an optimized subnetwork that contains grouped pointwise convolutions. Using Microsoft's phi-3-mini-4k-instruct as our baseline, we show that our optimized model (kphi-3) achieves comparable validation loss while using only 15-23% of the original non-embedding parameters. All experiments were conducted on a single NVIDIA L4 GPU within a 3-day timeframe, supporting the democratization of AI research. Our findings suggest that current LLM architectures may be substantially overparameterized, opening possibilities for more efficient model training and deployment.

Key Findings

  • Achieved 77% parameter reduction while maintaining model performance.
  • Demonstrated better generalization in optimized models.
  • Improved output quality in qualitative testing.

Implementation Details

  • Base Model: kphi3.
  • Training Dataset: LaMini.
  • Architecture: Modified transformer decoder with grouped pointwise convolutions.

Results

The following table shows LaMini training results with the baseline and the optimized versions. From left to right: experiment label, model name, number of transformer decoder layers, intermediate dimensions, number of non-embedding parameters, training loss and validation loss.

label model layers interm. dims. non-emb. params. % Train Loss Val. Loss
JP47D54C phi-3 2 8192 227M 1.08 1.58
JP47D55C kphi-3 2 9216 35M 15% 1.26 1.60
JP47D56C kphi-3 3 9216 53M 23% 1.21 1.57

Quick Links

Usage:

from transformers import AutoTokenizer, AutoModelForCausalLM, GenerationConfig, pipeline
from transformers import LlamaTokenizer
import torch

REPO_NAME = 'schuler/experimental-JP47D55C'

def load_model(local_repo_name):
    tokenizer = LlamaTokenizer.from_pretrained(local_repo_name, trust_remote_code=True)
    generator_conf = GenerationConfig.from_pretrained(local_repo_name)
    model = AutoModelForCausalLM.from_pretrained(local_repo_name, trust_remote_code=True, torch_dtype=torch.bfloat16, attn_implementation="eager")
    # model.to('cuda')
    return tokenizer, generator_conf, model

tokenizer, generator_conf, model = load_model(REPO_NAME)
global_error = ''
try:
  generator = pipeline("text-generation", model=model, tokenizer=tokenizer)
except Exception as e:
  global_error =  f"Failed to load model: {str(e)}"

def PrintTest(str):
  print(generator(str, max_new_tokens=256, do_sample=True, top_p=0.25, repetition_penalty=1.2))

PrintTest(f"<|user|>\nHello\n<|end|>\n<|assistant|>\n")
PrintTest(f"<|user|>Hello\n<|end|><|assistant|>")
PrintTest(f"<|user|>\nWhat is the human body?\n<|end|>\n<|assistant|>\n")
PrintTest(f"<|user|>What is the human body?\n<|end|><|assistant|>")
PrintTest(f"<|user|>What is biology?\n<|end|><|assistant|>")
PrintTest(f"<|user|>Can you comment about democracy?\n<|end|><|assistant|>")
PrintTest(f"<|user|>Can you provide detailed comments about the concept of democracy?\n<|end|><|assistant|>")
PrintTest(f"<|user|>Please give me a detailed description of the python computer language.\n<|end|><|assistant|>")
PrintTest(f"<|user|>If you had a difficult task to complete, how would you complete it?\n<|end|><|assistant|>")
PrintTest(f"<|user|>Before replying to my question, I would like you to provide two candidate solutions and do some reflexion about these solution. Then, you'll pick the best as the final reply. What is best: eating healthy or eating economically?\n<|end|><|assistant|>")

Output Examples

[{'generated_text': '<|user|>\nHello\n<|end|>\n<|assistant|>\n Okay. Please let me know what time you are thinking, and I will do my best to help out. <|end|>'}]
[{'generated_text': '<|user|>I have a question to you. Before you reply to my question, I need to to comment details about the subject without trying to directly reply to the question. As an example, if I ask you about modern architecture, you\'ll start describing what is architecture without specifics about modernism. Then, later you can reply to the question. Therefore, here we go with the question: what is ancient history?<|end|><|assistant|> The statement "Architecture originated from modern architecture" is not explicitly mentioned in the given information. However, it is true that the modernist architectural style of modern architecture may be based on historical and cultural factors. <|end|>'}]
[{'generated_text': "<|user|>Hello\n<|end|><|assistant|> Okay. When is the last time you have seen it happen? The last time I used my work has been on a Saturday night, and I was wondering if there were any changes to my life plan. The following day, we'll get together for an early start date:  \n1) The new year's first project will be released soon.\n2) My next one would be to attend the meeting in person by then so that all of our friends can enjoy the company.\n3) At the end of the month, I think the new part of the day was pretty well organized.\n4) What should I do when I am done and what steps are taken to accomplish each day?\n5) Are there any plans for the rest of the week to continue with your work or projects?\n6) Do anything exciting happening during the day or at night.\n7) Have dinner reservations planned out in advance, but please stay on track and arrive early on time.\n8) Follow any additional prompts or instructions provided by the team or individual. <|end|>"}]
[{'generated_text': '<|user|>\nWhat is the human body?\n<|end|>\n<|assistant|>\n The human body is a complex organ that is responsible for many different functions, including regulating hormones, growth and immune functioning. It also plays a crucial role in maintaining healthy bone tissue by controlling bodily functions such as digestion, metabolism, and movement of blood throughout the body. <|end|>'}]
[{'generated_text': '<|user|>What is the human body?\n<|end|><|assistant|> The human body is a complex organ that provides structure and function for various organs, including the heart, lungs, liver, kidneys, and brain. It also has many bones and internal organs such as the femur, tibia, fibula, and humerus that connects them to the pharynx and larynx. The circulatory system is important for maintaining proper bodily functions and protecting healthy organs from damage or injury by producing blood cells that regulate metabolism. <|end|>'}]
[{'generated_text': '<|user|>What is biology?\n<|end|><|assistant|> Biology is the study of living organisms and their interactions with each other, which helps us understand how different species interact with one another. This understanding can lead to better interrelationships between different organisms within a given ecosystem or population, ultimately impacting our ability to maintain healthy relationships with others. <|end|>'}]
[{'generated_text': '<|user|>Can you comment about democracy?\n<|end|><|assistant|> Democracy is a form of government in which power is held by the people, either directly or through elected representatives. It can be exercised by an individual vote, freedom from political participation to decisions made on their behalf, and citizens have no right to choose where they want to participate. The laws are designed to ensure that all citizens have equal access to education for both citizens and hold accountable, regardless of their status or position. Therefore, there is a strong focus on issues such as civil liberties, social justice, human rights, and democracy. <|end|>'}]
[{'generated_text': '<|user|>Can you provide detailed comments about the concept of democracy?\n<|end|><|assistant|> Democracy is a form of government in which power is held by the people and decisions are made through majority rule. In this system, citizens have to participate in decision-making processes related to voting, such as voting or vetoing elections, while also serving others who may not be able to make informed decisions independently. The term "democracy" (discontinued) emphasizes the importance of human rights, fairness, and respect for individual rights, while the alternative forms of social participation have been established that oppressive laws do not apply to every citizen status, but rather must address their concerns through peaceful means. <|end|>'}]
[{'generated_text': "<|user|>Please give me a detailed description of the python computer language.\n<|end|><|assistant|> I'm sorry, but I need more information to provide a specific Python topic for your programming project and any other relevant details. Could you please provide me with additional details or context? <|end|>"}]
[{'generated_text': '<|user|>If you had a difficult task to complete, how would you complete it?\n<|end|><|assistant|> Difficulty in completing the task. <|end|>'}]
[{'generated_text': "<|user|>Before replying to my question, I would like you to provide two candidate solutions and do some reflexion about these solution. Then, you'll pick the best as the final reply. What is best: eating healthy or eating economically?\n<|end|><|assistant|> Eat a healthy, balanced diet with both eyes and mouth when replying to your question may be the most effective way to improve one's overall well-being. <|end|>"}]

Citing this Model

@article{SchulerRojas_2025,
  title={Saving 77% of the Parameters in Large Language Models Technical Report},
  url={https://www.researchgate.net/publication/388835829_SAVING_77_OF_THE_PARAMETERS_IN_LARGE_LANGUAGE_MODELS_TECHNICAL_REPORT},
  author={Schwarz Schuler, Joao Paulo and Rojas Gómez, Alejandra},
  year={2025}}
Downloads last month
8
Safetensors
Model size
232M params
Tensor type
F32
·
Inference Providers NEW
This model is not currently available via any of the supported third-party Inference Providers, and HF Inference API does not yet support model repos that contain custom code.

Dataset used to train schuler/experimental-JP47D55C