First of all, Hugging Face does discriminate by country: some gated repos that require an access request will automatically reject accounts from mainland China (looking at you, LLaMA). There are two workarounds: download the model from ModelScope instead, or download a copy re-uploaded by another Hugging Face user (in which case you may need to edit the JSON files in the downloaded repo before the transformers API will accept it). Download speeds from within China are also painful, so switch to a domestic mirror. Before downloading, set the cache path explicitly, otherwise the model will be hard to find later. Even then, the layout of the downloaded files is a bit odd; the cleanest approach is to load the model with transformers and then save a copy, which gives you a tidy model directory. Let's see!

Domestic mirror

export HF_ENDPOINT=https://hf-mirror.com

Cache location

export HF_HOME=/path/to/your/local_dir
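The two variables are typically set together. A minimal shell sketch (the cache path is a placeholder; it matches the path that shows up later in this post):

```shell
# Route downloads through the mirror and root the cache at a known path.
# /data/huggingface_cache is a placeholder -- use any writable directory.
export HF_ENDPOINT=https://hf-mirror.com
export HF_HOME=/data/huggingface_cache
# Downloaded repos are cached under "$HF_HOME/hub"
echo "$HF_HOME/hub"    # → /data/huggingface_cache/hub
```

Add the two `export` lines to your `~/.bashrc` (or equivalent) if you want them to persist across sessions.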

Downloading the full repo with Python

from huggingface_hub import snapshot_download

def download_from_huggingface(model_name):
    print(f"Downloading \033[32m\033[1m {model_name} \033[0m from Hugging Face ...")
    # revision=... pins a specific repo revision; note that a transformers
    # version number is NOT a valid revision for a model repo
    model_path = snapshot_download(repo_id=model_name)
    print(f"\033[32m\033[1m {model_name} \033[0m has been downloaded into \033[34m {model_path} \033[0m")
    return model_path

Afterwards you will find some Hugging Face bookkeeping files under the cache directory, and the model itself under the path printed above. Open that directory, though, and you will see that every file is a symlink, and the code is not necessarily what your installed transformers version expects. As is well known, the implementation of the same model changes across transformers versions. For example, GLM4: in 4.49 its skeleton inherits from the Llama implementation together with Phi's MLP function, while 4.46 ships a different implementation (and I cannot yet confirm that the two produce identical inference results).

models--Qwen--Qwen2.5-0.5B-Instruct/
├── blobs
│   ├── 07bfe0640cb5a0037f9322287fbfc682806cf672
│   ├── 0dbb161213629a23f0fc00ef286e6b1e366d180f
│   ├── 20024bfe7c83998e9aeaf98a0cd6a2ce6306c2f0
│   ├── 443909a61d429dff23010e5bddd28ff530edda00
│   ├── 4783fe10ac3adce15ac8f358ef5462739852c569
│   ├── 4b8373851d093eb9f3017443f27781c6971eff24
│   ├── 6634c8cc3133b3848ec74b9f275acaaa1ea618ab
│   ├── a6344aac8c09253b3b630fb776ae94478aa0275b
│   ├── dfc11073787daf1b0f9c0f1499487ab5f4c93738
│   └── fdf756fa7fcbe7404d5c60e26bff1a0c8b8aa1f72ced49e7dd0210fe288fb7fe
├── refs
│   └── main
└── snapshots
    └── 7ae557604adf67be50417f59c2c2f167def9a775
        ├── config.json -> ../../blobs/0dbb161213629a23f0fc00ef286e6b1e366d180f
        ├── generation_config.json -> ../../blobs/dfc11073787daf1b0f9c0f1499487ab5f4c93738
        ├── LICENSE -> ../../blobs/6634c8cc3133b3848ec74b9f275acaaa1ea618ab
        ├── merges.txt -> ../../blobs/20024bfe7c83998e9aeaf98a0cd6a2ce6306c2f0
        ├── model.safetensors -> ../../blobs/fdf756fa7fcbe7404d5c60e26bff1a0c8b8aa1f72ced49e7dd0210fe288fb7fe
        ├── README.md -> ../../blobs/4b8373851d093eb9f3017443f27781c6971eff24
        ├── tokenizer_config.json -> ../../blobs/07bfe0640cb5a0037f9322287fbfc682806cf672
        ├── tokenizer.json -> ../../blobs/443909a61d429dff23010e5bddd28ff530edda00
        └── vocab.json -> ../../blobs/4783fe10ac3adce15ac8f358ef5462739852c569
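This blobs/snapshots layout can be reproduced with a small stdlib-only mock, which makes it easy to see how the symlinks resolve (the revision hash and blob name here are made up for the demo):

```python
import os
import tempfile

# Mimic the cache layout above: blobs/ holds content-addressed files and
# snapshots/<revision>/ holds symlinks pointing back into blobs/.
root = tempfile.mkdtemp()
blobs = os.path.join(root, "blobs")
snap = os.path.join(root, "snapshots", "7ae5576")
os.makedirs(blobs)
os.makedirs(snap)

blob = os.path.join(blobs, "0dbb1612")
with open(blob, "w") as f:
    f.write('{"model_type": "qwen2"}')

# Relative symlink, exactly like ../../blobs/<hash> in the real cache
link = os.path.join(snap, "config.json")
os.symlink(os.path.relpath(blob, start=snap), link)

print(os.path.islink(link))                              # → True
print(os.path.realpath(link) == os.path.realpath(blob))  # → True
print(open(link).read())                                 # → {"model_type": "qwen2"}
```

The point of this design is deduplication: several snapshots of the same repo can share one blob instead of storing the file twice.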

The directory ${HF_HOME}/hub/models--Qwen--Qwen2.5-0.5B-Instruct/snapshots/7ae557604adf67be50417f59c2c2f167def9a775 is the repo you just downloaded. Take a look at the parameters in config.json:

{
  "architectures": [
    "Qwen2ForCausalLM"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 896,
  "initializer_range": 0.02,
  "intermediate_size": 4864,
  "max_position_embeddings": 32768,
  "max_window_layers": 21,
  "model_type": "qwen2",
  "num_attention_heads": 14,
  "num_hidden_layers": 24,
  "num_key_value_heads": 2,
  "rms_norm_eps": 1e-06,
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": true,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.43.1",
  "use_cache": true,
  "use_sliding_window": false,
  "vocab_size": 151936
}

"transformers_version": "4.43.1" ??
Here is another approach that rewrites the repo so it matches your installed transformers version: load the model and save a fresh copy.

import os

from transformers import AutoModelForCausalLM, AutoTokenizer
from huggingface_hub import snapshot_download

def download_from_huggingface(model_api, tokenizer_api, model_name):
    print(f"Downloading \033[32m\033[1m {model_name} \033[0m from Hugging Face ...")
    model_dir = os.path.join(os.path.dirname(__file__), model_name)
    model_path = snapshot_download(repo_id=model_name)

    try:
        model = model_api.from_pretrained(model_path, trust_remote_code=True)
        tokenizer = tokenizer_api.from_pretrained(model_path, trust_remote_code=True)
    except Exception as e:
        raise OSError(
            f"\033[31m Downloading {model_name} failed. Make sure it is a valid "
            f"repository, or check your network!\033[0m"
        ) from e

    # save_pretrained regenerates config.json (including transformers_version)
    # to match the currently installed transformers
    model.save_pretrained(model_dir)
    tokenizer.save_pretrained(model_dir)

    print(f"\033[32m\033[1m {model_name} \033[0m has been saved into \033[34m {model_dir} \033[0m")
    return model_dir

You will then find that config.json under the new path has become the version of your currently installed transformers:

{
  "_name_or_path": "/data/huggingface_cache/hub/models--Qwen--Qwen2.5-0.5B-Instruct/snapshots/7ae557604adf67be50417f59c2c2f167def9a775",
  "architectures": [
    "Qwen2ForCausalLM"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
  "eos_token_id": 151645,
  "hidden_act": "silu",
  "hidden_size": 896,
  "initializer_range": 0.02,
  "intermediate_size": 4864,
  "max_position_embeddings": 32768,
  "max_window_layers": 21,
  "model_type": "qwen2",
  "num_attention_heads": 14,
  "num_hidden_layers": 24,
  "num_key_value_heads": 2,
  "rms_norm_eps": 1e-06,
  "rope_scaling": null,
  "rope_theta": 1000000.0,
  "sliding_window": 32768,
  "tie_word_embeddings": true,
  "torch_dtype": "float32",
  "transformers_version": "4.49.0",
  "use_cache": true,
  "use_sliding_window": false,
  "vocab_size": 151936
}
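Note that the re-saved config changes more than just the version. A hand-abbreviated diff of the two configs above makes the side effects visible:

```python
# Hand-abbreviated copies of the two configs shown above.
old = {"torch_dtype": "bfloat16", "transformers_version": "4.43.1"}
new = {"torch_dtype": "float32", "transformers_version": "4.49.0",
       "rope_scaling": None}

# Keys whose value changed, or that only exist in the new config.
changed = {k: (old.get(k, "<absent>"), new[k])
           for k in new if old.get(k, "<absent>") != new[k]}
print(changed)
```

In particular, torch_dtype flipped from bfloat16 to float32, because from_pretrained loads weights in float32 by default, so the re-saved copy is roughly twice the size on disk. Passing `torch_dtype="auto"` to from_pretrained should preserve the checkpoint's original dtype.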

Closing remarks

With the steps above you can get the model you want and sort out the version-mismatch problem. The same trick also works on models you have already downloaded: reload and re-save them so that their parameters are aligned with your transformers version.
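For an already-saved plain copy, a cruder alternative is to edit the version field in config.json directly. A sketch using a temp directory as a hypothetical stand-in for your model directory (beware: inside the HF cache, config.json is a symlink into blobs/, so writing through it would touch the shared blob; only do this on a plain saved copy):

```python
import json
import os
import tempfile

# Hypothetical stand-in for <your_model_dir>/config.json
cfg_path = os.path.join(tempfile.mkdtemp(), "config.json")
with open(cfg_path, "w") as f:
    json.dump({"model_type": "qwen2", "transformers_version": "4.43.1"}, f)

# Read, patch the version field, and write the file back
with open(cfg_path) as f:
    cfg = json.load(f)
cfg["transformers_version"] = "4.49.0"  # match your installed transformers
with open(cfg_path, "w") as f:
    json.dump(cfg, f, indent=2)

with open(cfg_path) as f:
    print(json.load(f)["transformers_version"])  # → 4.49.0
```

Of course this only rewrites the metadata; reloading and re-saving, as above, is what actually regenerates the whole config.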

Hooray! 🎉