Llama 2-7B Fine-Tuned for Text-to-SQL

This model is a fine-tuned version of Llama 2-7B adapted for Text-to-SQL tasks. It was trained to generate SQL queries from natural language questions and is intended for systems that need to translate user questions into executable SQL.

Model Details

  • Model Name: Llama 2-7B Fine-Tuned for Text-to-SQL
  • Base Model: Llama 2-7B
  • Model Developers: Fine-tuned by MertML
  • License: Custom commercial license. Please refer to the repository for terms.
  • Intended Use: Designed for generating SQL queries from natural language input. Ideal for applications in databases, conversational agents, and data analysis tools.

Model Architecture

Llama 2-7B is an autoregressive language model based on the transformer architecture. The fine-tuned version has been specifically adapted for the Text-to-SQL task, trained to convert user-written questions into valid and executable SQL queries using supervised fine-tuning.

Intended Use Cases

The model translates natural language questions into SQL queries. It is suitable for database query generation, business intelligence applications, and conversational agents that interact with databases.
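
As an illustration of the translation task, the sketch below combines a table schema and a question into a single prompt string. The template, the section labels, and the function name build_prompt are hypothetical; this card does not state the prompt format used during fine-tuning, so the layout should be adjusted to match the actual training format.

```python
def build_prompt(question: str, schema: str) -> str:
    """Combine a schema and a question into one prompt string.

    The "### ..." section labels are an assumed layout for illustration,
    not the template used to train this model.
    """
    return (
        "### Schema:\n"
        f"{schema.strip()}\n\n"
        "### Question:\n"
        f"{question.strip()}\n\n"
        "### SQL:\n"
    )


if __name__ == "__main__":
    prompt = build_prompt(
        question="How many heads of the departments are older than 56?",
        schema="CREATE TABLE head (age INTEGER)",
    )
    print(prompt)
```

The prompt ends at the SQL label so that an autoregressive model continues the text with the query itself.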

Out-of-Scope Uses

While this model is capable of text generation, it is fine-tuned specifically for Text-to-SQL tasks and may not perform well for general-purpose language generation tasks.

Training Data

The model was fine-tuned using the refined-sql-create-context dataset, which contains natural language queries, corresponding table schemas, and the correct SQL queries. This dataset was preprocessed to ensure that all queries were valid and executable on a MySQL database.

  • Training Data Size: 11,632 samples, split into training, validation, and test sets (80%, 10%, 10%).
  • Data Source: SQL-create-context dataset (refined for this task).
  • Data Preprocessing: Ambiguities in table schemas were resolved, invalid SQL queries were removed, and normalization was performed on SQL formatting for consistent evaluation.
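
The split and the formatting normalization described above can be sketched roughly as follows. This is a minimal illustration, not the preprocessing pipeline used for this model: the helper names normalize_sql and split_dataset, the shuffle seed, and the specific normalization rules (collapsing whitespace, dropping a trailing semicolon) are all assumptions.

```python
import random
import re


def normalize_sql(query: str) -> str:
    """Collapse runs of whitespace and drop a trailing semicolon (assumed rules)."""
    collapsed = re.sub(r"\s+", " ", query).strip()
    return collapsed.rstrip(";").strip()


def split_dataset(samples, seed=0):
    """Shuffle and split into 80% train, 10% validation, and the remainder as test."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_train = len(shuffled) * 8 // 10  # integer arithmetic avoids float rounding
    n_val = len(shuffled) // 10
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # test set absorbs any rounding remainder
    return train, val, test


if __name__ == "__main__":
    train, val, test = split_dataset(range(11632))
    print(len(train), len(val), len(test))
    print(normalize_sql("SELECT  name\nFROM t ;"))
```

With this rounding, 11,632 samples yield 9,305 / 1,163 / 1,164 examples. The card does not state the exact split sizes, so the real counts may differ by a sample or two.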

Model Performance

Compared with the base model, the fine-tuned Llama 2-7B generated more syntactically correct and contextually relevant SQL queries. Performance was evaluated on queries of varying difficulty from the refined-sql-create-context dataset.

Evaluation Metrics

  • Accuracy: Measures the percentage of generated SQL queries that are syntactically and semantically correct.
  • Execution Success Rate: Measures the percentage of SQL queries that execute successfully against a database.
  • Response Quality: Assesses the relevance and correctness of the generated SQL queries in context.
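
The first two metrics can be approximated with a short script such as the one below. It is a sketch under stated assumptions rather than the evaluation code behind this card: accuracy is approximated by exact string match after whitespace normalization, execution is checked against an in-memory SQLite database instead of MySQL, and the field names context, prediction, and reference are hypothetical. Response Quality is not covered, since the card does not define how it is scored.

```python
import sqlite3


def _normalize(query: str) -> str:
    """Collapse whitespace and drop a trailing semicolon before comparing."""
    return " ".join(query.split()).rstrip(";").strip()


def executes(query: str, schema: str) -> bool:
    """Return True if the query runs against a fresh database built from schema.

    SQLite stands in for MySQL here (an assumption), so dialect-specific
    syntax may be judged differently than on the original setup.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(schema)  # one or more CREATE TABLE statements
        conn.execute(query).fetchall()
        return True
    except sqlite3.Error:
        return False
    finally:
        conn.close()


def evaluate(examples):
    """Compute exact-match accuracy and execution success rate as fractions."""
    if not examples:
        raise ValueError("examples must not be empty")
    matches = sum(
        _normalize(ex["prediction"]) == _normalize(ex["reference"])
        for ex in examples
    )
    successes = sum(executes(ex["prediction"], ex["context"]) for ex in examples)
    return {
        "accuracy": matches / len(examples),
        "execution_success_rate": successes / len(examples),
    }
```

Exact match is a strict proxy: a query that is semantically equivalent to the reference but written differently counts as wrong, so comparing execution results would be a more lenient alternative.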