---
license: apache-2.0
datasets:
- anon8231489123/ShareGPT_Vicuna_unfiltered
- PengQu/langchain-MRKL-finetune
language:
- zh
- en
---

# open_llama_7b_v2_vicuna_Chinese

open_llama_7b_v2_vicuna_Chinese is a chat model produced by full-parameter supervised finetuning on vicuna-style ShareGPT data in both **English** and **Chinese**.

- Foundation model: [open_llama_7b_v2](https://huggingface.co/openlm-research/open_llama_7b_v2), a language model that **permits commercial use**.
- Finetuning data: ShareGPT, ShareGPT-ZH, Langchain-MRKL-finetune
- Training code: based on [FastChat](https://github.com/lm-sys/FastChat)

## Loading the Weights with Hugging Face Transformers

**Please note that it is advised to avoid using the Hugging Face fast tokenizer for now, as we've observed that** [**the auto-converted fast tokenizer sometimes gives incorrect tokenizations**](https://github.com/huggingface/transformers/issues/24233)**.** This can be done by using the `LlamaTokenizer` class directly, or by passing the `use_fast=False` option to the `AutoTokenizer` class. See the following example for usage.

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Use the slow tokenizer (use_fast=False) to avoid the fast-tokenizer issue noted above.
tokenizer = AutoTokenizer.from_pretrained("PengQu/open_llama_7b_v2_vicuna_Chinese", use_fast=False)
model = AutoModelForCausalLM.from_pretrained("PengQu/open_llama_7b_v2_vicuna_Chinese").to("cuda")

# Vicuna-style prompt template; the user message is substituted into {}.
instruction = "A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions. USER: {} ASSISTANT:"
prompt = instruction.format('用flask写一个简单的http服务器。')
input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to("cuda")

generation_output = model.generate(input_ids=input_ids, max_new_tokens=512)
print(tokenizer.decode(generation_output[0], skip_special_tokens=True))
```

The output is as follows:<br>
```
用flask写一个简单的http服务器。

from flask import Flask

app = Flask(__name__)

@app.route('/')
def hello():
    return 'Hello, World!'

if __name__ == '__main__':
    app.run()

这段代码定义了一个Flask应用程序,并为根路径('/')定义了一个路由。当用户在其Web浏览器中导航到该路径时,将调用`hello()`函数,并返回字符串“Hello, World!”。
要运行此代码,您需要在计算机上安装Flask。您可以使用以下命令使用pip安装它:

pip install Flask

安装Flask后,您可以使用以下命令运行代码:

python app.py

这将启动一个本地开发服务器,您可以使用Web浏览器访问它,方法是导航到`http://localhost:5000/`。
您还可以通过添加其他路由和功能来进一步自定义代码。例如,您可以为不同的端点定义不同的路由,并使用请求数据执行某些操作。您还可以向应用程序添加错误处理和用户身份验证。
```
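
As mentioned above, the fast tokenizer can also be avoided by loading the `LlamaTokenizer` class directly instead of passing `use_fast=False`; a minimal equivalent sketch:

```python
from transformers import LlamaTokenizer

# Loading the slow tokenizer class directly bypasses the auto-converted
# fast tokenizer and its known incorrect-tokenization issue.
tokenizer = LlamaTokenizer.from_pretrained("PengQu/open_llama_7b_v2_vicuna_Chinese")
```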

## Major Improvements

- Trained from open_llama_7b_v2, so commercial use is fully permitted.
- Matches vicuna-7b in English performance and outperforms it in Chinese.
- Shows stronger programming ability than vicuna-7b, likely because open_llama_7b_v2 was trained with the StarCoder dataset.
- Supports the langchain-MRKL format (`agent="zero-shot-react-description"`); see the sketch after this list.
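
Since the model follows the langchain-MRKL agent format, here is a minimal sketch of how one might drive it as a `zero-shot-react-description` agent with the classic LangChain API. The `llm-math` tool, the generation parameters, and the example question are illustrative assumptions, not part of this model card:

```python
from transformers import pipeline
from langchain.llms import HuggingFacePipeline
from langchain.agents import initialize_agent, load_tools

# Wrap the model in a transformers text-generation pipeline
# (use_fast=False per the tokenizer advice above).
pipe = pipeline(
    "text-generation",
    model="PengQu/open_llama_7b_v2_vicuna_Chinese",
    use_fast=False,
    device=0,
    max_new_tokens=512,
)
llm = HuggingFacePipeline(pipeline=pipe)

# "zero-shot-react-description" is the MRKL-style agent type the model was tuned to support.
tools = load_tools(["llm-math"], llm=llm)  # illustrative tool choice
agent = initialize_agent(tools, llm, agent="zero-shot-react-description", verbose=True)
agent.run("What is 3 to the power of 5?")  # hypothetical example query
```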