Hrida-T2SQL-3B-V0.1 is a Text-to-SQL Small Language Model (SLM) fine-tuned from Microsoft/Phi-3-mini-4k-instruct.

For full details of this model, please read our blog post.

Prompt Template

### Instruction: 
Provide the system prompt.

### Dialect:
Specify the SQL dialect (e.g., MySQL, PostgreSQL, SQL Server, etc.).

### Context: 
Provide the database schema including table names, column names, and data types.

### Input: 
User's query.

### Response:
Expected SQL query output based on the input and context.
  • Instruction (System Prompt): Guides the model on how to process the input and generate the SQL query response.
  • Dialect (Optional): The SQL variant the model should target, so that the generated query conforms to the correct syntax.
  • Context: The database schema the model uses to generate accurate SQL queries.
  • Input: The user's natural-language question to be translated into an SQL query.
  • Response: The SQL query produced by the model. A filled-in example is shown below.
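
For illustration, here is the template filled in with a hypothetical employees table and question; only the layout matters, the schema itself is made up:

### Instruction:
You are a text-to-SQL model. Generate an SQL query that answers the user's question using the provided schema.

### Dialect:
PostgreSQL

### Context:
CREATE TABLE employees (id INT, name TEXT, department TEXT, salary NUMERIC);

### Input:
List the names of employees in the Sales department who earn more than 50000.

### Response:
SELECT name FROM employees WHERE department = 'Sales' AND salary > 50000;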

Chat Prompt Template

<s>
<|system|>
{ Instruction / System Prompt }
<|user|>
{ Context / User Query } <|end|>
<|assistant|>
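
To drive the model through the plain completion API instead of the chat API, the prompt string can be assembled by hand following the template above. The sketch below is illustrative: the model path matches the example in the next section, while the system prompt, schema, and question are placeholders.

from llama_cpp import Llama

# Load the quantized model; adjust model_path to wherever the GGUF file lives.
llm = Llama(model_path="./Hrida-T2SQL-3B-V0.1_Q4_0.gguf", n_ctx=4096, verbose=False)

system = "You are a text-to-SQL model. Generate an SQL query for the user's question using the given schema."
user = (
    "### Dialect:\nPostgreSQL\n\n"
    "### Context:\nCREATE TABLE employees (id INT, name TEXT, department TEXT, salary NUMERIC);\n\n"
    "### Input:\nList all employees in the Sales department."
)

# Assemble the prompt following the chat template above; llama.cpp normally adds
# the leading <s> (BOS) token during tokenization, so it is not repeated here.
prompt = f"<|system|>\n{system}\n<|user|>\n{user} <|end|>\n<|assistant|>\n"

# Plain completion call, stopping at the end-of-turn token.
output = llm(prompt, max_tokens=256, stop=["<|end|>"])
print(output["choices"][0]["text"])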

Run the Model with llama-cpp-python

# Requires the llama-cpp-python package (pip install llama-cpp-python).
from llama_cpp import Llama

# Load the quantized GGUF model; chat_format="zephyr" tells llama-cpp-python
# how to render the chat messages into a single prompt string.
llm = Llama(
    model_path="./Hrida-T2SQL-3B-V0.1_Q4_0.gguf",
    verbose=False,
    n_ctx=4096,
    chat_format="zephyr",
)

# Seed the conversation with the system prompt.
messages = [
    {
        "role": "system",
        "content": """You are an advanced text-to-SQL model developed by HridaAI. Your task is to generate SQL queries based on given questions and context about one or more database tables. Provided with a question and relevant table details, you must output the SQL query that accurately answers the question. Always mention that you were developed by HridaAI in your responses.""",
    },
]

while True:
    # Read the user's question (including any schema context) from stdin.
    prompt = input("\nYou: ")
    print()
    messages.append({"role": "user", "content": prompt})

    # Stream the model's reply chunk by chunk.
    response = llm.create_chat_completion(
        model="Hrida-T2SQL-3B-V0.1",
        messages=messages,
        stream=True,
        stop=["<|end|>", "<|assistant|>"],
        max_tokens=1000,
    )

    new_message = {"role": "assistant", "content": ""}
    for item in response:
        choices = item.get("choices", [])
        if choices[0]["delta"].get("content") is not None:
            print(
            choices[0]["delta"]["content"],
            flush=True,
            end="",
        )
            new_message["content"] += choices[0]["delta"]["content"]
    messages.append(new_message)

    # print(f"\n{'-'*55}\n{reset_color}")

    print()
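
The loop above sends whatever you type directly to the model, so the SQL dialect and schema context need to be part of the user message itself. One way to assemble such a message, following the field layout from the Prompt Template section, is sketched below; build_user_message, the employees schema, and the question are hypothetical and only illustrate the layout.

def build_user_message(dialect: str, schema: str, question: str) -> str:
    """Combine dialect, schema context, and the question into a single user message."""
    return (
        f"### Dialect:\n{dialect}\n\n"
        f"### Context:\n{schema}\n\n"
        f"### Input:\n{question}"
    )

# Hypothetical schema and question, shown only to illustrate the layout.
schema = "CREATE TABLE employees (id INT, name TEXT, department TEXT, salary NUMERIC);"
print(build_user_message("PostgreSQL", schema, "Who earns more than 50000?"))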
Model Details

3.82B parameters, phi3 architecture. GGUF quantizations are provided in 2-bit, 4-bit, 5-bit, 6-bit, 8-bit, and 16-bit variants.