ChatGLM2-6B Fine-Tuning Practice

Contents:
- Environment Preparation
- Installation and Deployment
  1. Install Anaconda
  2. Install CUDA
  3. Install PyTorch
  4. Install ChatGLM2-6B
- Fine-Tuning Practice
  1. Prepare the dataset
  2. Install Python dependencies
  3. Fine-tune and train a new model
  4. Inference and evaluation of the fine-tuned model
  5. Verify and use the fine-tuned model
- Problems Encountered During Fine-Tuning

Environment Preparation
Request an Alibaba Cloud GPU server with the following configuration:

- OS: CentOS 7.6 64-bit
- Anaconda: Anaconda3-2023.07-1-Linux-x86_64
- Python: 3.11.5
- GPU: NVIDIA A10, 24 GB VRAM, 1 GPU
- CPU: 8 vCores / 30 GB RAM

Installation and Deployment
1. Install Anaconda

wget https://repo.anaconda.com/archive/Anaconda3-2023.07-1-Linux-x86_64.sh
sh Anaconda3-2023.07-1-Linux-x86_64.sh

Follow the prompts to complete the installation.
2. Install CUDA

wget https://developer.download.nvidia.com/compute/cuda/11.2.0/local_installers/cuda_11.2.0_460.27.04_linux.run
sh cuda_11.2.0_460.27.04_linux.run

Follow the prompts to complete the installation.
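Before moving on, it can help to confirm that the driver and toolkit are visible (an optional sanity check, assuming the default install prefix /usr/local/cuda):

nvidia-smi
/usr/local/cuda/bin/nvcc --version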
3. Install PyTorch

conda install pytorch torchvision pytorch-cuda=11.8 -c pytorch -c nvidia

If the conda command is not found, add Anaconda to your PATH environment variable first.
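A quick way to confirm that PyTorch was installed with CUDA support and can see the GPU (an optional check, not part of the original steps):

python -c "import torch; print(torch.__version__, torch.cuda.is_available())"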
4. Install ChatGLM2-6B
mkdir ChatGLM
cd ChatGLM
git clone https://github.com/THUDM/ChatGLM2-6B.git
cd ChatGLM2-6B
pip install -r requirements.txt

Loading the model requires downloading its 7 shard files from the internet (a dozen or so GB in total); you may want to download them in advance.
Model download address: https://huggingface.co/THUDM/chatglm2-6b/tree/main
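If you would rather fetch all seven shard files in one step, one option is to clone the model repository with Git LFS (a sketch; the local directory zhbr/chatglm2-6b is chosen here only to match the model_name_or_path used by the scripts below):

git lfs install
git clone https://huggingface.co/THUDM/chatglm2-6b zhbr/chatglm2-6b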
Fine-Tuning Practice

1. Prepare the dataset

Prepare our own dataset and generate a training file and a validation file, and place both under the directory ChatGLM2-6B/ptuning/myDataset/, in the format shown below.
Training set file: train_file.json; validation set file: val_file.json
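Both files follow the same JSON Lines layout as the AdvertiseGen example that ships with ChatGLM2-6B: one JSON object per line, with field names matching the prompt_column and response_column configured in train.sh (content and summary). A sample entry, paraphrased from the training example printed in the log further down:

{"content": "配网故障类别及原因", "summary": "1、外力破坏：包括车辆撞击、树木刮擦、风筝坠落、倒杆断线等；2、季节性故障：……"}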
2. Install Python dependencies

The fine-tuning step depends on a few extra Python modules; install them ahead of time:

pip install rouge_chinese nltk jieba datasets

3. Fine-tune and train a new model
Modify the train.sh script according to your own setup. The configuration after modification is:

PRE_SEQ_LEN=128
LR=2e-2
NUM_GPUS=1

torchrun --standalone --nnodes=1 --nproc-per-node=$NUM_GPUS main.py \
    --do_train \
    --train_file myDataset/train_file.json \
    --validation_file myDataset/val_file.json \
    --preprocessing_num_workers 6 \
    --prompt_column content \
    --response_column summary \
    --overwrite_cache \
    --model_name_or_path /root/ChatGLM/ChatGLM2-6B-main/zhbr/chatglm2-6b \
    --output_dir output/zhbr-chatglm2-6b-checkpoint \
    --overwrite_output_dir \
    --max_source_length 64 \
    --max_target_length 128 \
    --per_device_train_batch_size 6 \
    --per_device_eval_batch_size 6 \
    --gradient_accumulation_steps 16 \
    --predict_with_generate \
    --max_steps 20 \
    --logging_steps 5 \
    --save_steps 5 \
    --learning_rate $LR \
    --pre_seq_len $PRE_SEQ_LEN \
    --quantization_bit 4

Once the changes are in place, start the fine-tuning:
cd /root/ChatGLM/ChatGLM2-6B/ptuning/
sh train.sh

The output of the run is as follows:
(base) [root@iZbp178u8rw9n9ko94ubbyZ ptuning]# sh train.sh
[2023-10-08 13:09:12,312] torch.distributed.run: [WARNING] master_addr is only used for static rdzv_backend and when rdzv_endpoint is not specified.
10/08/2023 13:09:15 - WARNING - __main__ - Process rank: 0, device: cuda:0, n_gpu: 1distributed training: True, 16-bits training: False
10/08/2023 13:09:15 - INFO - __main__ - Training/evaluation parameters Seq2SeqTrainingArguments(
_n_gpu=1,
adafactor=False,
adam_beta1=0.9,
adam_beta2=0.999,
adam_epsilon=1e-08,
auto_find_batch_size=False,
bf16=False,
bf16_full_eval=False,
data_seed=None,
dataloader_drop_last=False,
dataloader_num_workers=0,
dataloader_pin_memory=True,
ddp_backend=None,
ddp_broadcast_buffers=None,
ddp_bucket_cap_mb=None,
ddp_find_unused_parameters=None,
ddp_timeout=1800,
debug=[],
deepspeed=None,
disable_tqdm=False,
dispatch_batches=None,
do_eval=False,
do_predict=False,
do_train=True,
eval_accumulation_steps=None,
eval_delay=0,
eval_steps=None,
evaluation_strategy=IntervalStrategy.NO,
fp16=False,
fp16_backend=auto,
fp16_full_eval=False,
fp16_opt_level=O1,
fsdp=[],
fsdp_config={'min_num_params': 0, 'xla': False, 'xla_fsdp_grad_ckpt': False},
fsdp_min_num_params=0,
fsdp_transformer_layer_cls_to_wrap=None,
full_determinism=False,
generation_config=None,
generation_max_length=None,
generation_num_beams=None,
gradient_accumulation_steps=16,
gradient_checkpointing=False,
greater_is_better=None,
group_by_length=False,
half_precision_backend=auto,
hub_always_push=False,
hub_model_id=None,
hub_private_repo=False,
hub_strategy=HubStrategy.EVERY_SAVE,
hub_token=<HUB_TOKEN>,
ignore_data_skip=False,
include_inputs_for_metrics=False,
jit_mode_eval=False,
label_names=None,
label_smoothing_factor=0.0,
learning_rate=0.02,
length_column_name=length,
load_best_model_at_end=False,
local_rank=0,
log_level=passive,
log_level_replica=warning,
log_on_each_node=True,
logging_dir=output/zhbr-chatglm2-6b-checkpoint/runs/Oct08_13-09-15_iZbp178u8rw9n9ko94ubbyZ,
logging_first_step=False,
logging_nan_inf_filter=True,
logging_steps=5,
logging_strategy=IntervalStrategy.STEPS,
lr_scheduler_type=SchedulerType.LINEAR,
max_grad_norm=1.0,
max_steps=20,
metric_for_best_model=None,
mp_parameters=,
no_cuda=False,
num_train_epochs=3.0,
optim=OptimizerNames.ADAMW_TORCH,
optim_args=None,
output_dir=output/zhbr-chatglm2-6b-checkpoint,
overwrite_output_dir=True,
past_index=-1,
per_device_eval_batch_size=6,
per_device_train_batch_size=6,
predict_with_generate=True,
prediction_loss_only=False,
push_to_hub=False,
push_to_hub_model_id=None,
push_to_hub_organization=None,
push_to_hub_token=<PUSH_TO_HUB_TOKEN>,
ray_scope=last,
remove_unused_columns=True,
report_to=[],
resume_from_checkpoint=None,
run_name=output/zhbr-chatglm2-6b-checkpoint,
save_on_each_node=False,
save_safetensors=False,
save_steps=5,
save_strategy=IntervalStrategy.STEPS,
save_total_limit=None,
seed=42,
sharded_ddp=[],
skip_memory_metrics=True,
sortish_sampler=False,
tf32=None,
torch_compile=False,
torch_compile_backend=None,
torch_compile_mode=None,
torchdynamo=None,
tpu_metrics_debug=False,
tpu_num_cores=None,
use_cpu=False,
use_ipex=False,
use_legacy_prediction_loop=False,
use_mps_device=False,
warmup_ratio=0.0,
warmup_steps=0,
weight_decay=0.0,
)
10/08/2023 13:09:16 - WARNING - datasets.builder - Found cached dataset json (/root/.cache/huggingface/datasets/json/default-8e52c57dfec9ef61/0.0.0/e347ab1c932092252e717ff3f949105a4dd28b27e842dd53157d2f72e276c2e4)
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:0000:00, 1379.71it/s]
[INFO|configuration_utils.py:713] 2023-10-08 13:09:16,749 loading configuration file /root/ChatGLM/ChatGLM2-6B-main/zhbr/chatglm2-6b/config.json
[INFO|configuration_utils.py:713] 2023-10-08 13:09:16,751 loading configuration file /root/ChatGLM/ChatGLM2-6B-main/zhbr/chatglm2-6b/config.json
[INFO|configuration_utils.py:775] 2023-10-08 13:09:16,751 Model config ChatGLMConfig {_name_or_path: /root/ChatGLM/ChatGLM2-6B-main/zhbr/chatglm2-6b,add_bias_linear: false,add_qkv_bias: true,apply_query_key_layer_scaling: true,apply_residual_connection_post_layernorm: false,architectures: [ChatGLMModel],attention_dropout: 0.0,attention_softmax_in_fp32: true,auto_map: {AutoConfig: configuration_chatglm.ChatGLMConfig,AutoModel: modeling_chatglm.ChatGLMForConditionalGeneration,AutoModelForCausalLM: modeling_chatglm.ChatGLMForConditionalGeneration,AutoModelForSeq2SeqLM: modeling_chatglm.ChatGLMForConditionalGeneration,AutoModelForSequenceClassification: modeling_chatglm.ChatGLMForSequenceClassification},bias_dropout_fusion: true,classifier_dropout: null,eos_token_id: 2,ffn_hidden_size: 13696,fp32_residual_connection: false,hidden_dropout: 0.0,hidden_size: 4096,kv_channels: 128,layernorm_epsilon: 1e-05,model_type: chatglm,multi_query_attention: true,multi_query_group_num: 2,num_attention_heads: 32,num_layers: 28,original_rope: true,pad_token_id: 0,padded_vocab_size: 65024,post_layer_norm: true,pre_seq_len: null,prefix_projection: false,quantization_bit: 0,rmsnorm: true,seq_length: 32768,tie_word_embeddings: false,torch_dtype: float16,transformers_version: 4.32.1,use_cache: true,vocab_size: 65024
}
[INFO|tokenization_utils_base.py:1850] 2023-10-08 13:09:16,752 loading file tokenizer.model
[INFO|tokenization_utils_base.py:1850] 2023-10-08 13:09:16,752 loading file added_tokens.json
[INFO|tokenization_utils_base.py:1850] 2023-10-08 13:09:16,752 loading file special_tokens_map.json
[INFO|tokenization_utils_base.py:1850] 2023-10-08 13:09:16,753 loading file tokenizer_config.json
[INFO|modeling_utils.py:2776] 2023-10-08 13:09:16,832 loading weights file /root/ChatGLM/ChatGLM2-6B-main/zhbr/chatglm2-6b/pytorch_model.bin.index.json
[INFO|configuration_utils.py:768] 2023-10-08 13:09:16,833 Generate config GenerationConfig {_from_model_config: true,eos_token_id: 2,pad_token_id: 0,transformers_version: 4.32.1
}
Loading checkpoint shards: 100%|████████████████████████████████████████████████████████████| 7/7 [00:05<00:00, 1.39it/s]
[INFO|modeling_utils.py:3551] 2023-10-08 13:09:21,906 All model checkpoint weights were used when initializing ChatGLMForConditionalGeneration.
[WARNING|modeling_utils.py:3553] 2023-10-08 13:09:21,906 Some weights of ChatGLMForConditionalGeneration were not initialized from the model checkpoint at /root/ChatGLM/ChatGLM2-6B-main/zhbr/chatglm2-6b and are newly initialized: [transformer.prefix_encoder.embedding.weight]
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
[INFO|modeling_utils.py:3136] 2023-10-08 13:09:21,908 Generation config file not found, using a generation config created from the model config.
Quantized to 4 bit
input_ids [64790, 64792, 790, 30951, 517, 30910, 30939, 30996, 13, 13, 54761, 31211, 55046, 54766, 36989, 38724, 54643, 31962, 13, 13, 55437, 31211, 30910, 30939, 31201, 54675, 54592, 33933, 31211, 31779, 32804, 51962, 31201, 39510, 57517, 56689, 31201, 48981, 57486, 55014, 31201, 55568, 56528, 55082, 54831, 54609, 54659, 30943, 31201, 35066, 54642, 36989, 31211, 31779, 35066, 54642, 56042, 55662, 31201, 54539, 56827, 31201, 55422, 54639, 55534, 31201, 33576, 57062, 54848, 31201, 55662, 55816, 41670, 39305, 33760, 36989, 54659, 30966, 31201, 32531, 31838, 54643, 31668, 31687, 31211, 31779, 32531, 31838, 33853, 31201, 32077, 43641, 31201, 54933, 55194, 32366, 32531, 49729, 39305, 33760, 36989, 54659, 30972, 31201, 31641, 48655, 31211, 31779, 36293, 54535, 32155, 31201, 45561, 54585, 31940, 54535, 32155, 31201, 54962, 55478, 54535, 32155, 54609, 31641, 31746, 31639, 31123, 32023, 54603, 36989, 55045, 58286, 49539, 31639, 31123, 36128, 33423, 32077, 36989, 31155, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
inputs [Round 1]问配网故障类别及原因答 1、外力破坏包括车辆撞击、树木刮擦、风筝坠落、倒杆断线等2、季节性故障包括季节性覆冰、大雾、雨加雪、温度骤变、冰灾等因素导致的线路故障3、施工质量及技术方面包括施工质量不良、设备老化、未按规范施工等原因导致的线路故障4、管理不到位包括巡视不及时、发现问题后处理不及时、消缺不及时等管理上的问题导致小故障积攒成大问题进而引发设备故障。
label_ids [-100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, 30910, 30939, 31201, 54675, 54592, 33933, 31211, 31779, 32804, 51962, 31201, 39510, 57517, 56689, 31201, 48981, 57486, 55014, 31201, 55568, 56528, 55082, 54831, 54609, 54659, 30943, 31201, 35066, 54642, 36989, 31211, 31779, 35066, 54642, 56042, 55662, 31201, 54539, 56827, 31201, 55422, 54639, 55534, 31201, 33576, 57062, 54848, 31201, 55662, 55816, 41670, 39305, 33760, 36989, 54659, 30966, 31201, 32531, 31838, 54643, 31668, 31687, 31211, 31779, 32531, 31838, 33853, 31201, 32077, 43641, 31201, 54933, 55194, 32366, 32531, 49729, 39305, 33760, 36989, 54659, 30972, 31201, 31641, 48655, 31211, 31779, 36293, 54535, 32155, 31201, 45561, 54585, 31940, 54535, 32155, 31201, 54962, 55478, 54535, 32155, 54609, 31641, 31746, 31639, 31123, 32023, 54603, 36989, 55045, 58286, 49539, 31639, 31123, 36128, 33423, 32077, 36989, 31155, 2, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100]
labels 1、外力破坏包括车辆撞击、树木刮擦、风筝坠落、倒杆断线等2、季节性故障包括季节性覆冰、大雾、雨加雪、温度骤变、冰灾等因素导致的线路故障3、施工质量及技术方面包括施工质量不良、设备老化、未按规范施工等原因导致的线路故障4、管理不到位包括巡视不及时、发现问题后处理不及时、消缺不及时等管理上的问题导致小故障积攒成大问题进而引发设备故障。
[INFO|trainer.py:565] 2023-10-08 13:09:26,290 max_steps is given, it will override any value given in num_train_epochs
[INFO|trainer.py:1714] 2023-10-08 13:09:26,460 ***** Running training *****
[INFO|trainer.py:1715] 2023-10-08 13:09:26,460   Num examples = 17
[INFO|trainer.py:1716] 2023-10-08 13:09:26,460   Num Epochs = 20
[INFO|trainer.py:1717] 2023-10-08 13:09:26,460   Instantaneous batch size per device = 6
[INFO|trainer.py:1720] 2023-10-08 13:09:26,460   Total train batch size (w. parallel, distributed & accumulation) = 96
[INFO|trainer.py:1721] 2023-10-08 13:09:26,460   Gradient Accumulation steps = 16
[INFO|trainer.py:1722] 2023-10-08 13:09:26,460   Total optimization steps = 20
[INFO|trainer.py:1723] 2023-10-08 13:09:26,460   Number of trainable parameters = 1,835,008
  0%|          | 0/20 [00:00<?, ?it/s]
10/08/2023 13:09:26 - WARNING - transformers_modules.chatglm2-6b.modeling_chatglm - `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...
/root/anaconda3/lib/python3.11/site-packages/torch/utils/checkpoint.py:429: UserWarning: torch.utils.checkpoint: please pass in use_reentrant=True or use_reentrant=False explicitly. The default value of use_reentrant will be updated to be False in the future. To maintain current behavior, pass use_reentrant=True. It is recommended that you use use_reentrant=False. Refer to docs for more details on the differences between the two variants.
  warnings.warn(
{'loss': 0.5058, 'learning_rate': 0.015, 'epoch': 5.0}
 25%|██████████████████████                                       | 5/20 [00:21<00:56, 3.77s/it]
Saving PrefixEncoder
[INFO|configuration_utils.py:460] 2023-10-08 13:09:47,797 Configuration saved in output/zhbr-chatglm2-6b-checkpoint/checkpoint-5/config.json
[INFO|configuration_utils.py:544] 2023-10-08 13:09:47,797 Configuration saved in output/zhbr-chatglm2-6b-checkpoint/checkpoint-5/generation_config.json
[INFO|modeling_utils.py:1953] 2023-10-08 13:09:47,805 Model weights saved in output/zhbr-chatglm2-6b-checkpoint/checkpoint-5/pytorch_model.bin
[INFO|tokenization_utils_base.py:2235] 2023-10-08 13:09:47,805 tokenizer config file saved in output/zhbr-chatglm2-6b-checkpoint/checkpoint-5/tokenizer_config.json
[INFO|tokenization_utils_base.py:2242] 2023-10-08 13:09:47,807 Special tokens file saved in output/zhbr-chatglm2-6b-checkpoint/checkpoint-5/special_tokens_map.json
/root/anaconda3/lib/python3.11/site-packages/torch/utils/checkpoint.py:429: UserWarning: torch.utils.checkpoint: please pass in use_reentrant=True or use_reentrant=False explicitly. The default value of use_reentrant will be updated to be False in the future. To maintain current behavior, pass use_reentrant=True. It is recommended that you use use_reentrant=False. Refer to docs for more details on the differences between the two variants.
  warnings.warn(
{'loss': 0.2925, 'learning_rate': 0.01, 'epoch': 9.0}
 50%|███████████████████████████████                              | 10/20 [00:34<00:31, 3.17s/it]
Saving PrefixEncoder
[INFO|configuration_utils.py:460] 2023-10-08 13:10:01,413 Configuration saved in output/zhbr-chatglm2-6b-checkpoint/checkpoint-10/config.json
[INFO|configuration_utils.py:544] 2023-10-08 13:10:01,413 Configuration saved in output/zhbr-chatglm2-6b-checkpoint/checkpoint-10/generation_config.json
[INFO|modeling_utils.py:1953] 2023-10-08 13:10:01,419 Model weights saved in output/zhbr-chatglm2-6b-checkpoint/checkpoint-10/pytorch_model.bin
[INFO|tokenization_utils_base.py:2235] 2023-10-08 13:10:01,420 tokenizer config file saved in output/zhbr-chatglm2-6b-checkpoint/checkpoint-10/tokenizer_config.json
[INFO|tokenization_utils_base.py:2242] 2023-10-08 13:10:01,420 Special tokens file saved in output/zhbr-chatglm2-6b-checkpoint/checkpoint-10/special_tokens_map.json
/root/anaconda3/lib/python3.11/site-packages/torch/utils/checkpoint.py:429: UserWarning: torch.utils.checkpoint: please pass in use_reentrant=True or use_reentrant=False explicitly. The default value of use_reentrant will be updated to be False in the future. To maintain current behavior, pass use_reentrant=True. It is recommended that you use use_reentrant=False. Refer to docs for more details on the differences between the two variants.
  warnings.warn(
{'loss': 0.2593, 'learning_rate': 0.005, 'epoch': 13.0}
 75%|██████████████████████████████████████████████               | 15/20 [00:48<00:14, 2.93s/it]
Saving PrefixEncoder
[INFO|configuration_utils.py:460] 2023-10-08 13:10:15,139 Configuration saved in output/zhbr-chatglm2-6b-checkpoint/checkpoint-15/config.json
[INFO|configuration_utils.py:544] 2023-10-08 13:10:15,139 Configuration saved in output/zhbr-chatglm2-6b-checkpoint/checkpoint-15/generation_config.json
[INFO|modeling_utils.py:1953] 2023-10-08 13:10:15,146 Model weights saved in output/zhbr-chatglm2-6b-checkpoint/checkpoint-15/pytorch_model.bin
[INFO|tokenization_utils_base.py:2235] 2023-10-08 13:10:15,146 tokenizer config file saved in output/zhbr-chatglm2-6b-checkpoint/checkpoint-15/tokenizer_config.json
[INFO|tokenization_utils_base.py:2242] 2023-10-08 13:10:15,146 Special tokens file saved in output/zhbr-chatglm2-6b-checkpoint/checkpoint-15/special_tokens_map.json
/root/anaconda3/lib/python3.11/site-packages/torch/utils/checkpoint.py:429: UserWarning: torch.utils.checkpoint: please pass in use_reentrant=True or use_reentrant=False explicitly. The default value of use_reentrant will be updated to be False in the future. To maintain current behavior, pass use_reentrant=True. It is recommended that you use use_reentrant=False. Refer to docs for more details on the differences between the two variants.
  warnings.warn(
{'loss': 0.3026, 'learning_rate': 0.0, 'epoch': 18.0}
100%|█████████████████████████████████████████████████████████████| 20/20 [01:05<00:00, 3.35s/it]
Saving PrefixEncoder
[INFO|configuration_utils.py:460] 2023-10-08 13:10:32,333 Configuration saved in output/zhbr-chatglm2-6b-checkpoint/checkpoint-20/config.json
[INFO|configuration_utils.py:544] 2023-10-08 13:10:32,333 Configuration saved in output/zhbr-chatglm2-6b-checkpoint/checkpoint-20/generation_config.json
[INFO|modeling_utils.py:1953] 2023-10-08 13:10:32,340 Model weights saved in output/zhbr-chatglm2-6b-checkpoint/checkpoint-20/pytorch_model.bin
[INFO|tokenization_utils_base.py:2235] 2023-10-08 13:10:32,340 tokenizer config file saved in output/zhbr-chatglm2-6b-checkpoint/checkpoint-20/tokenizer_config.json
[INFO|tokenization_utils_base.py:2242] 2023-10-08 13:10:32,340 Special tokens file saved in output/zhbr-chatglm2-6b-checkpoint/checkpoint-20/special_tokens_map.json
[INFO|trainer.py:1962] 2023-10-08 13:10:32,354 Training completed. Do not forget to share your model on huggingface.co/models =)
{'train_runtime': 65.8941, 'train_samples_per_second': 29.138, 'train_steps_per_second': 0.304, 'train_loss': 0.3400604248046875, 'epoch': 18.0}
100%|█████████████████████████████████████████████████████████████| 20/20 [01:05<00:00, 3.29s/it]
***** train metrics *****
  epoch                    =       18.0
  train_loss               =     0.3401
  train_runtime            = 0:01:05.89
  train_samples            =         17
  train_samples_per_second =     29.138
  train_steps_per_second   =      0.304

4. Inference and evaluation of the fine-tuned model
To evaluate and verify the fine-tuned model, modify the checkpoint directory in the evaluate.sh script:

PRE_SEQ_LEN=128
CHECKPOINT=zhbr-chatglm2-6b-checkpoint
STEP=20
NUM_GPUS=1

torchrun --standalone --nnodes=1 --nproc-per-node=$NUM_GPUS main.py \
    --do_predict \
    --validation_file myDataset/train_file.json \
    --test_file myDataset/val_file.json \
    --overwrite_cache \
    --prompt_column content \
    --response_column summary \
    --model_name_or_path /root/ChatGLM/ChatGLM2-6B-main/zhbr/chatglm2-6b \
    --ptuning_checkpoint ./output/$CHECKPOINT/checkpoint-$STEP \
    --output_dir ./output/$CHECKPOINT \
    --overwrite_output_dir \
    --max_source_length 64 \
    --max_target_length 64 \
    --per_device_eval_batch_size 1 \
    --predict_with_generate \
    --pre_seq_len $PRE_SEQ_LEN \
    --quantization_bit 4

Then run inference and evaluation with the fine-tuned model:

cd /root/ChatGLM/ChatGLM2-6B/ptuning/
sh evaluate.sh

The output of the run is as follows:
(base) [root@iZbp178u8rw9n9ko94ubbyZ ptuning]# sh evaluate.sh
[2023-10-08 13:19:53,448] torch.distributed.run: [WARNING] master_addr is only used for static rdzv_backend and when rdzv_endpoint is not specified.
10/08/2023 13:19:56 - WARNING - __main__ - Process rank: 0, device: cuda:0, n_gpu: 1distributed training: True, 16-bits training: False
10/08/2023 13:19:56 - INFO - __main__ - Training/evaluation parameters Seq2SeqTrainingArguments(
_n_gpu=1,
adafactor=False,
adam_beta1=0.9,
adam_beta2=0.999,
adam_epsilon=1e-08,
auto_find_batch_size=False,
bf16=False,
bf16_full_eval=False,
data_seed=None,
dataloader_drop_last=False,
dataloader_num_workers=0,
dataloader_pin_memory=True,
ddp_backend=None,
ddp_broadcast_buffers=None,
ddp_bucket_cap_mb=None,
ddp_find_unused_parameters=None,
ddp_timeout=1800,
debug=[],
deepspeed=None,
disable_tqdm=False,
dispatch_batches=None,
do_eval=False,
do_predict=True,
do_train=False,
eval_accumulation_steps=None,
eval_delay=0,
eval_steps=None,
evaluation_strategy=IntervalStrategy.NO,
fp16=False,
fp16_backend=auto,
fp16_full_eval=False,
fp16_opt_level=O1,
fsdp=[],
fsdp_config={'min_num_params': 0, 'xla': False, 'xla_fsdp_grad_ckpt': False},
fsdp_min_num_params=0,
fsdp_transformer_layer_cls_to_wrap=None,
full_determinism=False,
generation_config=None,
generation_max_length=None,
generation_num_beams=None,
gradient_accumulation_steps=1,
gradient_checkpointing=False,
greater_is_better=None,
group_by_length=False,
half_precision_backend=auto,
hub_always_push=False,
hub_model_id=None,
hub_private_repo=False,
hub_strategy=HubStrategy.EVERY_SAVE,
hub_token=<HUB_TOKEN>,
ignore_data_skip=False,
include_inputs_for_metrics=False,
jit_mode_eval=False,
label_names=None,
label_smoothing_factor=0.0,
learning_rate=5e-05,
length_column_name=length,
load_best_model_at_end=False,
local_rank=0,
log_level=passive,
log_level_replica=warning,
log_on_each_node=True,
logging_dir=./output/zhbr-chatglm2-6b-checkpoint/runs/Oct08_13-19-56_iZbp178u8rw9n9ko94ubbyZ,
logging_first_step=False,
logging_nan_inf_filter=True,
logging_steps=500,
logging_strategy=IntervalStrategy.STEPS,
lr_scheduler_type=SchedulerType.LINEAR,
max_grad_norm=1.0,
max_steps=-1,
metric_for_best_model=None,
mp_parameters=,
no_cuda=False,
num_train_epochs=3.0,
optim=OptimizerNames.ADAMW_TORCH,
optim_args=None,
output_dir=./output/zhbr-chatglm2-6b-checkpoint,
overwrite_output_dir=True,
past_index=-1,
per_device_eval_batch_size=1,
per_device_train_batch_size=8,
predict_with_generate=True,
prediction_loss_only=False,
push_to_hub=False,
push_to_hub_model_id=None,
push_to_hub_organization=None,
push_to_hub_token=<PUSH_TO_HUB_TOKEN>,
ray_scope=last,
remove_unused_columns=True,
report_to=[],
resume_from_checkpoint=None,
run_name=./output/zhbr-chatglm2-6b-checkpoint,
save_on_each_node=False,
save_safetensors=False,
save_steps=500,
save_strategy=IntervalStrategy.STEPS,
save_total_limit=None,
seed=42,
sharded_ddp=[],
skip_memory_metrics=True,
sortish_sampler=False,
tf32=None,
torch_compile=False,
torch_compile_backend=None,
torch_compile_mode=None,
torchdynamo=None,
tpu_metrics_debug=False,
tpu_num_cores=None,
use_cpu=False,
use_ipex=False,
use_legacy_prediction_loop=False,
use_mps_device=False,
warmup_ratio=0.0,
warmup_steps=0,
weight_decay=0.0,
)
Downloading and preparing dataset json/default to /root/.cache/huggingface/datasets/json/default-98f5c44ca2dd481e/0.0.0/e347ab1c932092252e717ff3f949105a4dd28b27e842dd53157d2f72e276c2e4…
Downloading data files: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2⁄2 [00:0000:00, 17623.13it/s]
Extracting data files: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2⁄2 [00:0000:00, 3012.07it/s]
Dataset json downloaded and prepared to /root/.cache/huggingface/datasets/json/default-98f5c44ca2dd481e/0.0.0/e347ab1c932092252e717ff3f949105a4dd28b27e842dd53157d2f72e276c2e4. Subsequent calls will reuse this data.
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2⁄2 [00:0000:00, 1488.66it/s]
[INFO|configuration_utils.py:713] 2023-10-08 13:19:57,908 loading configuration file /root/ChatGLM/ChatGLM2-6B-main/zhbr/chatglm2-6b/config.json
[INFO|configuration_utils.py:713] 2023-10-08 13:19:57,909 loading configuration file /root/ChatGLM/ChatGLM2-6B-main/zhbr/chatglm2-6b/config.json
[INFO|configuration_utils.py:775] 2023-10-08 13:19:57,910 Model config ChatGLMConfig {_name_or_path: /root/ChatGLM/ChatGLM2-6B-main/zhbr/chatglm2-6b,add_bias_linear: false,add_qkv_bias: true,apply_query_key_layer_scaling: true,apply_residual_connection_post_layernorm: false,architectures: [ChatGLMModel],attention_dropout: 0.0,attention_softmax_in_fp32: true,auto_map: {AutoConfig: configuration_chatglm.ChatGLMConfig,AutoModel: modeling_chatglm.ChatGLMForConditionalGeneration,AutoModelForCausalLM: modeling_chatglm.ChatGLMForConditionalGeneration,AutoModelForSeq2SeqLM: modeling_chatglm.ChatGLMForConditionalGeneration,AutoModelForSequenceClassification: modeling_chatglm.ChatGLMForSequenceClassification},bias_dropout_fusion: true,classifier_dropout: null,eos_token_id: 2,ffn_hidden_size: 13696,fp32_residual_connection: false,hidden_dropout: 0.0,hidden_size: 4096,kv_channels: 128,layernorm_epsilon: 1e-05,model_type: chatglm,multi_query_attention: true,multi_query_group_num: 2,num_attention_heads: 32,num_layers: 28,original_rope: true,pad_token_id: 0,padded_vocab_size: 65024,post_layer_norm: true,pre_seq_len: null,prefix_projection: false,quantization_bit: 0,rmsnorm: true,seq_length: 32768,tie_word_embeddings: false,torch_dtype: float16,transformers_version: 4.32.1,use_cache: true,vocab_size: 65024
}
[INFO|tokenization_utils_base.py:1850] 2023-10-08 13:19:57,911 loading file tokenizer.model
[INFO|tokenization_utils_base.py:1850] 2023-10-08 13:19:57,911 loading file added_tokens.json
[INFO|tokenization_utils_base.py:1850] 2023-10-08 13:19:57,911 loading file special_tokens_map.json
[INFO|tokenization_utils_base.py:1850] 2023-10-08 13:19:57,911 loading file tokenizer_config.json
[INFO|modeling_utils.py:2776] 2023-10-08 13:19:57,988 loading weights file /root/ChatGLM/ChatGLM2-6B-main/zhbr/chatglm2-6b/pytorch_model.bin.index.json
[INFO|configuration_utils.py:768] 2023-10-08 13:19:57,989 Generate config GenerationConfig {_from_model_config: true,eos_token_id: 2,pad_token_id: 0,transformers_version: 4.32.1
}
Loading checkpoint shards: 100%|████████████████████████████████████████████████████████████| 7/7 [00:04<00:00, 1.41it/s]
[INFO|modeling_utils.py:3551] 2023-10-08 13:20:02,988 All model checkpoint weights were used when initializing ChatGLMForConditionalGeneration.
[WARNING|modeling_utils.py:3553] 2023-10-08 13:20:02,988 Some weights of ChatGLMForConditionalGeneration were not initialized from the model checkpoint at /root/ChatGLM/ChatGLM2-6B-main/zhbr/chatglm2-6b and are newly initialized: [transformer.prefix_encoder.embedding.weight]
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
[INFO|modeling_utils.py:3136] 2023-10-08 13:20:02,989 Generation config file not found, using a generation config created from the model config.
Quantized to 4 bit
input_ids [64790, 64792, 790, 30951, 517, 30910, 30939, 30996, 13, 13, 54761, 31211, 55046, 54848, 55623, 55279, 36989, 13, 13, 55437, 31211]
inputs [Round 1]问配变雷击故障答
label_ids [64790, 64792, 30910, 55623, 54710, 31921, 55279, 54538, 55046, 38754, 33760, 54746, 32077, 31123, 32023, 33760, 41711, 31201, 32077, 55870, 56544, 35978, 31155]
labels 雷电直接击中配电网线路或设备导致线路损坏、设备烧毁等问题。
10/08/2023 13:20:06 - INFO - __main__ - *** Predict ***
[INFO|trainer.py:3119] 2023-10-08 13:20:06,946 ***** Running Prediction *****
[INFO|trainer.py:3121] 2023-10-08 13:20:06,946   Num examples = 2
[INFO|trainer.py:3124] 2023-10-08 13:20:06,946   Batch size = 1
[INFO|configuration_utils.py:768] 2023-10-08 13:20:06,949 Generate config GenerationConfig {_from_model_config: true, eos_token_id: 2, pad_token_id: 0, transformers_version: 4.32.1
}
  0%|          | 0/2 [00:00<?, ?it/s]
[INFO|configuration_utils.py:768] 2023-10-08 13:20:11,223 Generate config GenerationConfig {_from_model_config: true, eos_token_id: 2, pad_token_id: 0, transformers_version: 4.32.1
}
100%|█████████████████████████████████████████████████████████████| 2/2 [00:01<00:00, 1.09it/s]
Building prefix dict from the default dictionary …
10/08/2023 13:20:13 - DEBUG - jieba - Building prefix dict from the default dictionary …
Dumping model to file cache /tmp/jieba.cache
10/08/2023 13:20:13 - DEBUG - jieba - Dumping model to file cache /tmp/jieba.cache
Loading model cost 0.440 seconds.
10/08/2023 13:20:13 - DEBUG - jieba - Loading model cost 0.440 seconds.
Prefix dict has been built successfully.
10/08/2023 13:20:13 - DEBUG - jieba - Prefix dict has been built successfully.
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2⁄2 [00:0200:00, 1.14s/it]
***** predict metrics *****
  predict_bleu-4             =    10.9723
  predict_rouge-1            =    44.0621
  predict_rouge-2            =    11.9047
  predict_rouge-l            =    33.5968
  predict_runtime            = 0:00:06.56
  predict_samples            =          2
  predict_samples_per_second =      0.305
  predict_steps_per_second   =      0.305

5. Verify and use the fine-tuned model
Method 1

Write a Python script that loads the checkpoint directory produced by the fine-tuning run:
from transformers import AutoConfig, AutoModel, AutoTokenizer
import os
import torch
# Load the tokenizer and the base model
tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm2-6b", trust_remote_code=True)
config = AutoConfig.from_pretrained("THUDM/chatglm2-6b", trust_remote_code=True, pre_seq_len=128)
model = AutoModel.from_pretrained("THUDM/chatglm2-6b", config=config, trust_remote_code=True)

# Load the prefix-encoder weights from the fine-tuned checkpoint
prefix_state_dict = torch.load(os.path.join("./output/zhbr-chatglm2-6b-checkpoint/checkpoint-20", "pytorch_model.bin"))
new_prefix_state_dict = {}
for k, v in prefix_state_dict.items():
    if k.startswith("transformer.prefix_encoder."):
        new_prefix_state_dict[k[len("transformer.prefix_encoder."):]] = v
model.transformer.prefix_encoder.load_state_dict(new_prefix_state_dict)

# Comment out the following line if you don't use quantization
model = model.quantize(4)  # or 8

model = model.half().cuda()
model.transformer.prefix_encoder.float()
model = model.eval()

response, history = model.chat(tokenizer, "配网线路故障有哪些", history=[])
print(response)

Method 2

Modify web_demo.sh under ptuning according to your own setup:

PRE_SEQ_LEN=128

CUDA_VISIBLE_DEVICES=0 python3 web_demo.py \
    --model_name_or_path /root/ChatGLM/ChatGLM2-6B-main/zhbr/chatglm2-6b \
    --ptuning_checkpoint output/zhbr-chatglm2-6b-checkpoint/checkpoint-20 \
    --pre_seq_len $PRE_SEQ_LEN

Run web_demo.sh and visit http://xxx.xxx.xxx.xxx:7860.

(base) [root@iZbp178u8rw9n9ko94ubbyZ ptuning]# sh web_demo.sh
/root/ChatGLM/ChatGLM2-6B-main/ptuning/web_demo.py:101: GradioDeprecationWarning: The `style` method is deprecated. Please set these arguments in the constructor instead.
  user_input = gr.Textbox(show_label=False, placeholder="Input...", lines=10).style(
Loading prefix_encoder weight from output/zhbr-chatglm2-6b-checkpoint/checkpoint-20
Loading checkpoint shards: 100%|████████████████████████████████████████████████████████████| 7/7 [00:04<00:00, 1.47it/s]
Some weights of ChatGLMForConditionalGeneration were not initialized from the model checkpoint at /root/ChatGLM/ChatGLM2-6B-main/zhbr/chatglm2-6b and are newly initialized: [transformer.prefix_encoder.embedding.weight]
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Running on local URL:  http://127.0.0.1:7860
To create a public link, set share=True in launch().

Problems Encountered During Fine-Tuning

The error messages were:

dataclasses.FrozenInstanceError: cannot assign to field generation_max_length
dataclasses.FrozenInstanceError: cannot assign to field generation_num_beams

Solution: comment out the code in main.py that assigns these fields; a sketch follows below.
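The original post showed the exact lines as a screenshot that is not preserved here. Based on the field names in the error, the block to comment out is the one in ptuning/main.py that writes generation_max_length and generation_num_beams back onto the frozen training_args dataclass, roughly as sketched below (line positions and exact wording may differ in your copy):

# In ptuning/main.py, comment out the assignments to the frozen
# Seq2SeqTrainingArguments instance (hypothetical sketch of the block):
# training_args.generation_max_length = (
#     training_args.generation_max_length
#     if training_args.generation_max_length is not None
#     else data_args.val_max_target_length
# )
# training_args.generation_num_beams = (
#     data_args.num_beams if data_args.num_beams is not None else training_args.generation_num_beams
# )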