YuLan-Chat: An Open-Source Bilingual Chatbot

YuLan-Chat is a family of chat-oriented large language models developed by researchers at GSAI, Renmin University of China. The name YuLan refers to the Yulan magnolia, the campus flower of Renmin University of China. The newest version is built by continually pre-training and instruction-tuning LLaMA-2 on high-quality English and Chinese data. It has the following technical characteristics:
  • Continued pre-training on high-quality Chinese-English bilingual data improves the model's language ability.
  • To better support Chinese and longer inputs and outputs, we expand the original vocabulary with Chinese words and extend the maximum length of LLaMA-2. The model now supports an 8k context.
  • To better activate the bilingual instruction-following capacity, we construct high-quality bilingual instructions and perform multi-stage instruction tuning.
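
To make the 8k limit concrete: assuming it corresponds to the 8,192-token length listed in the Model Zoo table, the prompt and the generated response share a single token budget. The sketch below uses a hypothetical helper, `remaining_budget`, which is not part of the YuLan-Chat code base and only illustrates the arithmetic.

```python
# Hypothetical helper, not part of YuLan-Chat: shows how a fixed context
# window is shared between the prompt and the generated reply.
CONTEXT_LENGTH = 8192  # assumed from the "Extended Length" column of the Model Zoo table


def remaining_budget(prompt_tokens: int, context_length: int = CONTEXT_LENGTH) -> int:
    """Return how many tokens can still be generated after the prompt."""
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    if prompt_tokens >= context_length:
        raise ValueError("prompt already fills the context window")
    return context_length - prompt_tokens


if __name__ == "__main__":
    print(remaining_budget(1200))  # 8192 - 1200 = 6992
```

For the earlier LLaMA-based models, the same helper can be called with `context_length=2048`.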

Model Zoo

Due to license restrictions, for models based on LLaMA we provide only the weight differences from the original checkpoints; models based on LLaMA-2 can be used directly. Please check the Usage section for more details.
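
As a rough illustration of what a weight difference means, the sketch below recovers fine-tuned weights by adding a delta to the base weights, parameter by parameter. It is not the project's recovery script: the function name `apply_delta` is hypothetical, weights are modelled as plain Python lists rather than tensors, and it assumes the delta was computed as (fine-tuned minus base) with identical shapes. A model whose vocabulary was extended would need extra handling that is omitted here.

```python
# Hypothetical sketch: recover fine-tuned weights as base + delta.
# Real checkpoints hold tensors; plain lists keep the example self-contained.
from typing import Dict, List

Weights = Dict[str, List[float]]


def apply_delta(base: Weights, delta: Weights) -> Weights:
    """Add each delta parameter to the matching base parameter."""
    if base.keys() != delta.keys():
        raise ValueError("base and delta must contain the same parameter names")
    merged: Weights = {}
    for name, base_values in base.items():
        delta_values = delta[name]
        if len(base_values) != len(delta_values):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        merged[name] = [b + d for b, d in zip(base_values, delta_values)]
    return merged


if __name__ == "__main__":
    base = {"layer.weight": [1.0, 2.0]}
    delta = {"layer.weight": [0.5, -1.0]}
    print(apply_delta(base, delta))
```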

Limitations: Despite our efforts to reduce potential safety issues during use and to encourage the generation of text that aligns with ethical and legal requirements, the language model is based on probabilistic generation and may still produce unexpected outputs. For instance, generated responses may contain biases, discrimination, or other harmful content. Please do not propagate such content. We do not assume any responsibility for any consequences resulting from the dissemination of harmful information.

| Model | Backbone | Extended Vocab | Extended Length | Continue PT | SFT | Released Date |
| --- | --- | --- | --- | --- | --- | --- |
| YuLan-Chat-2-13B | LLaMA2-13B | ✅ 51,190 | ✅ 8,192 | ✅ | ✅ | 2023.8.2 |
| YuLan-LLaMA-2-13B | LLaMA2-13B | ✅ 51,190 | ✅ 8,192 | ✅ | ❌ | 2023.8.2 |
| YuLan-Chat-1-65B-v2 | LLaMA-65B | ✅ 51,190 | ❌ 2,048 | ✅ | ✅ | 2023.8.2 |
| YuLan-Chat-1-13B-v1 | LLaMA-13B | ❌ 32,000 | ❌ 2,048 | ❌ | ✅ | 2023.6.8 |
| YuLan-Chat-1-65B-v1 | LLaMA-65B | ❌ 32,000 | ❌ 2,048 | ❌ | ✅ | 2023.6.8 |

Evaluation

We evaluate YuLan-Chat on several Chinese and English benchmarks. The results are shown below.

MMLU

MMLU (Massive Multitask Language Understanding) is a widely used English benchmark designed to measure knowledge acquired during pretraining by evaluating models exclusively in zero-shot and few-shot settings.

| Model | STEM | Social Science | Humanities | Others | Avg. |
| --- | --- | --- | --- | --- | --- |
| YuLan-Chat-1-13B-v1 | 39.6 | 57.8 | 42.6 | 57.6 | 49.4 |
| YuLan-Chat-1-65B-v1 | 49.2 | 71.7 | 57.7 | 66.7 | 61.3 |
| YuLan-Chat-1-65B-v2 | 46.3 | 67.9 | 56.9 | 63.9 | 58.7 |
| LLaMA-2-13B | 44.6 | 64.2 | 53.9 | 62.2 | 56.2 |
| FlagAlpha/Llama2-Chinese-13b-Chat | 44.4 | 63.2 | 51.6 | 60.6 | 55.0 |
| Linly-AI/Chinese-LLaMA-2-13B-hf | 43.6 | 62.7 | 49.8 | 61.6 | 54.4 |
| YuLan-LLaMA-2-13B | 42.9 | 61.5 | 50.4 | 58.6 | 53.4 |
| YuLan-Chat-2-13B | 45.3 | 66.7 | 53.8 | 62.8 | 57.2 |

C-Eval

C-Eval is a comprehensive Chinese evaluation suite for foundation models.

| Model | STEM | Social Science | Humanities | Others | Avg. | Avg. (Hard) |
| --- | --- | --- | --- | --- | --- | --- |
| YuLan-Chat-1-13B-v1 | 30.2 | 37.4 | 31.9 | 30.7 | 32.0 | 25.7 |
| YuLan-Chat-1-65B-v1 | 37.7 | 46.1 | 36.8 | 38.0 | 39.2 | 31.1 |
| YuLan-Chat-1-65B-v2 | 39.9 | 55.9 | 47.7 | 43.7 | 45.4 | 31.4 |
| LLaMA-2-13B | 36.9 | 43.2 | 37.6 | 36.6 | 38.2 | 32.0 |
| FlagAlpha/Llama2-Chinese-13b-Chat | 36.8 | 44.5 | 36.3 | 36.5 | 38.1 | 30.9 |
| Linly-AI/Chinese-LLaMA-2-13B-hf | 33.7 | 44.8 | 36.6 | 36.5 | 37 | 27.7 |
| YuLan-LLaMA-2-13B | 35.3 | 46.4 | 41.9 | 37.6 | 39.3 | 28.6 |
| YuLan-Chat-2-13B | 38.9 | 49.7 | 45.0 | 40.8 | 42.6 | 32.2 |

AGI-Eval-Gaokao

AGI-Eval is a human-centric benchmark specifically designed to evaluate the general abilities of foundation models in tasks pertinent to human cognition and problem-solving. We use the sub-branch Chinese-Gaokao for evaluation.

| Model | Avg. | Chinese | English | Geography | History | Biology | Chemistry | Physics | Math-QA | Math-Cloze |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| YuLan-Chat-1-13B-v1 | 24.3 | 22.4 | 60.1 | 27.6 | 25.5 | 21.9 | 30.0 | 8.0 | 21.1 | 1.7 |
| YuLan-Chat-1-65B-v1 | 29.3 | 25.2 | 79.1 | 37.2 | 36.6 | 28.6 | 24.2 | 11.0 | 21.9 | 0.0 |
| YuLan-Chat-1-65B-v2 | 37.9 | 31.4 | 80.4 | 50.8 | 56.6 | 33.3 | 29.0 | 32.0 | 24.4 | 0.8 |
| LLaMA-2-13B | 32.7 | 27.2 | 72.2 | 36.2 | 43.0 | 26.2 | 32.4 | 30.0 | 26.2 | 0.9 |
| FlagAlpha/Llama2-Chinese-13b-Chat | 31.6 | 26.4 | 70.6 | 35.2 | 38.7 | 28.1 | 28.0 | 29.5 | 25.6 | 2.5 |
| Linly-AI/Chinese-LLaMA-2-13B-hf | 31.1 | 22.8 | 74.8 | 42.2 | 37.9 | 24.3 | 28.0 | 23.0 | 26.5 | 0.0 |
| YuLan-LLaMA-2-13B | 34.2 | 25.2 | 70.3 | 43.2 | 48.5 | 30.0 | 29.5 | 31.0 | 28.5 | 1.7 |
| YuLan-Chat-2-13B | 39.5 | 37.0 | 85.3 | 46.7 | 51.9 | 43.8 | 38.2 | 29.0 | 23.1 | 0.9 |

Usage

Import from Huggingface Transformers

Because our model is built on LLaMA, it can be loaded in the same way as the original LLaMA.

>>> from transformers import LlamaTokenizer, LlamaForCausalLM
>>> # Load the tokenizer and the model, then move the model to the GPU.
>>> tokenizer = LlamaTokenizer.from_pretrained("yulan-team/YuLan-Chat-2-13b")
>>> model = LlamaForCausalLM.from_pretrained("yulan-team/YuLan-Chat-2-13b").cuda()
>>> model = model.eval()
>>> # Wrap the user input in the conversation template.
>>> input_text = "hello"
>>> prompt = "The following is a conversation between a human and an AI assistant namely YuLan, developed by GSAI, Renmin University of China. The AI assistant gives helpful, detailed, and polite answers to the user's questions.\n[|Human|]:{}\n[|AI|]:".format(input_text)
>>> inputs = tokenizer(prompt, return_tensors='pt', padding="longest", max_length=8192, truncation=True, return_attention_mask=True, add_special_tokens=True)
>>> # Sampling settings; max_length counts the prompt plus the generated tokens.
>>> kwargs = {'temperature': 0.8, 'top_p': 0.95, "top_k": 50, "repetition_penalty": 1.1, "no_repeat_ngram_size": 64, "max_length": 8192, "pad_token_id": tokenizer.bos_token_id, "eos_token_id": tokenizer.eos_token_id}
>>> outputs = model.generate(inputs['input_ids'].to(model.device), attention_mask=inputs['attention_mask'].to(model.device), do_sample=True, **kwargs)
>>> # batch_decode returns a list of strings; take the first one and drop the prompt prefix.
>>> print(tokenizer.batch_decode(outputs, skip_special_tokens=True)[0][len(prompt):])
Hello! How can I assist you today?
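
The prompt handling in the example above can be factored into two small helpers. This is a minimal sketch: the template text is copied from the example, while the names `build_prompt` and `extract_reply` are hypothetical and not part of the YuLan-Chat code. The fallback on the `[|AI|]:` marker is an assumption for cases where decoding does not reproduce the prompt exactly.

```python
# Hypothetical helpers built around the conversation template shown above.
SYSTEM_PROMPT = (
    "The following is a conversation between a human and an AI assistant namely YuLan, "
    "developed by GSAI, Renmin University of China. The AI assistant gives helpful, "
    "detailed, and polite answers to the user's questions."
)
AI_MARKER = "[|AI|]:"


def build_prompt(input_text: str) -> str:
    """Wrap a single user message in the conversation template."""
    return f"{SYSTEM_PROMPT}\n[|Human|]:{input_text}\n{AI_MARKER}"


def extract_reply(decoded: str, prompt: str) -> str:
    """Return the text the model generated after the prompt."""
    if decoded.startswith(prompt):
        return decoded[len(prompt):].strip()
    # Assumed fallback: split on the last assistant marker if the
    # decoded text does not start with the exact prompt string.
    return decoded.rsplit(AI_MARKER, 1)[-1].strip()


if __name__ == "__main__":
    prompt = build_prompt("hello")
    decoded = prompt + " Hello! How can I assist you today?"
    print(extract_reply(decoded, prompt))
```

With the model loaded as above, `build_prompt(input_text)` would replace the inline `.format` call, and `extract_reply` would be applied to the first decoded string.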

License

YuLan-Chat is released under the MIT License. All data and code in this project may be used for academic purposes only.

Contributors

Reference

Please cite our work if you find it helpful.

@misc{YuLan-Chat,
  author = {YuLan-Team},
  title = {YuLan-Chat: An Open-Source Bilingual Chatbot},
  year = {2023},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/RUC-GSAI/YuLan-Chat}},
}