Instructions to use Skywork/Skywork-13B-base with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use Skywork/Skywork-13B-base with Transformers:
# Use a pipeline as a high-level helper from transformers import pipeline pipe = pipeline("text-generation", model="Skywork/Skywork-13B-base", trust_remote_code=True)# Load model directly from transformers import AutoModelForCausalLM model = AutoModelForCausalLM.from_pretrained("Skywork/Skywork-13B-base", trust_remote_code=True, dtype="auto") - Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use Skywork/Skywork-13B-base with vLLM:
Install from pip and serve model
# Install vLLM from pip: pip install vllm # Start the vLLM server: vllm serve "Skywork/Skywork-13B-base" # Call the server using curl (OpenAI-compatible API): curl -X POST "http://localhost:8000/v1/completions" \ -H "Content-Type: application/json" \ --data '{ "model": "Skywork/Skywork-13B-base", "prompt": "Once upon a time,", "max_tokens": 512, "temperature": 0.5 }'Use Docker
docker model run hf.co/Skywork/Skywork-13B-base
- SGLang
How to use Skywork/Skywork-13B-base with SGLang:
Install from pip and serve model
# Install SGLang from pip: pip install sglang # Start the SGLang server: python3 -m sglang.launch_server \ --model-path "Skywork/Skywork-13B-base" \ --host 0.0.0.0 \ --port 30000 # Call the server using curl (OpenAI-compatible API): curl -X POST "http://localhost:30000/v1/completions" \ -H "Content-Type: application/json" \ --data '{ "model": "Skywork/Skywork-13B-base", "prompt": "Once upon a time,", "max_tokens": 512, "temperature": 0.5 }'Use Docker images
docker run --gpus all \ --shm-size 32g \ -p 30000:30000 \ -v ~/.cache/huggingface:/root/.cache/huggingface \ --env "HF_TOKEN=<secret>" \ --ipc=host \ lmsysorg/sglang:latest \ python3 -m sglang.launch_server \ --model-path "Skywork/Skywork-13B-base" \ --host 0.0.0.0 \ --port 30000 # Call the server using curl (OpenAI-compatible API): curl -X POST "http://localhost:30000/v1/completions" \ -H "Content-Type: application/json" \ --data '{ "model": "Skywork/Skywork-13B-base", "prompt": "Once upon a time,", "max_tokens": 512, "temperature": 0.5 }' - Docker Model Runner
How to use Skywork/Skywork-13B-base with Docker Model Runner:
docker model run hf.co/Skywork/Skywork-13B-base
Update README.md
Browse files
README.md
CHANGED
|
@@ -107,7 +107,7 @@ We use Byte-Pair Encoding (BPE) to tokenize the data, with a vocabulary size of
|
|
| 107 |
## 领域数据困惑度评估(Perplexity Evaluaiton)
|
| 108 |
语言模型训练的本质上是让预测下一个词更准确。基于这个认知,我们认为评估基础大模型一个重要的方式是评估在各大领域上语言模型生成文章的概率。在模型训练中预测下一个词的概率一般使用Cross Entropy损失函数,整体的损失函数为每个位置预测真实词损失的平均,则有:
|
| 109 |
|
| 110 |
-
$$loss = \sum^{n}_{i=1} log(p_i) / n = log( \prod_{i=1}^n p_i) / n$$
|
| 111 |
|
| 112 |
其中$n$是文档的长度,即token数,$p_i$是位置i上真实词的概率,我们知道文档中每一个位置上真实词的概率的联乘则为生成该文档的概率,如此我们就将loss和生成文章的概率联系在了一起。而不同模型因为使用的分词器不同,具有不同的token数,因此对损失函数乘以token数目$n$,这样就仅考虑生成文章的概率部分,不同模型也可以进行比较。我们将标准化后loss取指数转换成perplexity,使得模型的差异更加可读。为了阅读方便后续提到的loss和ppl为模型标准化后的loss和perplexity。
|
| 113 |
|
|
|
|
| 107 |
## 领域数据困惑度评估(Perplexity Evaluaiton)
|
| 108 |
语言模型训练的本质上是让预测下一个词更准确。基于这个认知,我们认为评估基础大模型一个重要的方式是评估在各大领域上语言模型生成文章的概率。在模型训练中预测下一个词的概率一般使用Cross Entropy损失函数,整体的损失函数为每个位置预测真实词损失的平均,则有:
|
| 109 |
|
| 110 |
+
$$loss = -\sum^{n}_{i=1} log(p_i) / n = -log( \prod_{i=1}^n p_i) / n$$
|
| 111 |
|
| 112 |
其中$n$是文档的长度,即token数,$p_i$是位置i上真实词的概率,我们知道文档中每一个位置上真实词的概率的联乘则为生成该文档的概率,如此我们就将loss和生成文章的概率联系在了一起。而不同模型因为使用的分词器不同,具有不同的token数,因此对损失函数乘以token数目$n$,这样就仅考虑生成文章的概率部分,不同模型也可以进行比较。我们将标准化后loss取指数转换成perplexity,使得模型的差异更加可读。为了阅读方便后续提到的loss和ppl为模型标准化后的loss和perplexity。
|
| 113 |
|