
Website Self-Service Authorization System, WordPress Plugin Update

Author: 五速梦信息网
Date: June 19, 2026, 07:20


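I. GRU

1. Concept

GRU (Gated Recurrent Unit) is a variant of the recurrent neural network (RNN) designed to address the vanishing-gradient problem that a standard RNN runs into when handling long-term dependencies. By introducing a gating mechanism, GRU simplifies the design of LSTM (Long Short-Term Memory), making the model lighter while keeping the advantages of LSTM.

2. Principle

2.1 Two major improvements

1. The three gates of LSTM (input gate, forget gate, output gate) are reduced to two: the update gate (Update Gate) and the reset gate (Reset Gate).
2. The (candidate) cell state and the hidden state (output) are merged, so only the candidate hidden state and the hidden state of the current time step remain.

2.2 Simplified model structure (internal structure)

Through its gating mechanism, GRU can effectively capture the temporal dynamics in sequence data. Because its structure is simpler than that of LSTM, it usually has fewer parameters and is cheaper to compute.

2.2.1 Reset gate: the reset gate decides how much past information is ignored when the current candidate hidden state is computed.

2.2.2 Update gate: the update gate decides how much past information is retained. It is computed from the hidden state of the previous time step \( h_{t-1} \) and the current input \( x_t \).

2.2.3 Candidate hidden state: the candidate hidden state is the proposed update for the current time step. It contains information from both the current input and the past hidden state. The reset gate acts here: it lets the model discard or keep the previous hidden state.

2.2.4 Final hidden state: the final hidden state is computed by blending the past hidden state with the current candidate hidden state. The update gate controls the ratio between past and current information: part of the information passed down in \( h_{t-1} \) is forgotten, and part of the information from the current input is added. The result is the final memory.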
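For reference, the four steps of section 2.2 can be written out as equations. This is a sketch in one common convention, chosen to match the concatenated-weight form used by the native code in this article. The symbols \( W_z, W_r, W_h \), \( b_z, b_r, b_h \), \( \sigma \) (sigmoid) and \( \odot \) (element-wise product) are notation assumed here, not taken from the original text.

```latex
\begin{aligned}
z_t &= \sigma\left(W_z\,[x_t,\ h_{t-1}] + b_z\right) && \text{(update gate)} \\
r_t &= \sigma\left(W_r\,[x_t,\ h_{t-1}] + b_r\right) && \text{(reset gate)} \\
\hat{h}_t &= \tanh\left(W_h\,[x_t,\ r_t \odot h_{t-1}] + b_h\right) && \text{(candidate hidden state)} \\
h_t &= (1 - z_t) \odot h_{t-1} + z_t \odot \hat{h}_t && \text{(final hidden state)}
\end{aligned}
```

In this convention \( 1 - z_t \) is the fraction of the old state that is kept and \( z_t \) is the fraction taken from the candidate. Some references swap the two roles, which changes nothing about what the model can express.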

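3. Code implementation

3.1 Native code

    import numpy as np


    class GRU:
        def __init__(self, input_size, hidden_size):
            self.input_size = input_size
            self.hidden_size = hidden_size
            # Initialize W and b for the update gate
            self.W_z = np.random.rand(hidden_size, input_size + hidden_size)
            self.b_z = np.zeros(hidden_size)
            # Reset gate
            self.W_r = np.random.rand(hidden_size, input_size + hidden_size)
            self.b_r = np.zeros(hidden_size)
            # Candidate hidden state
            self.W_h = np.random.rand(hidden_size, input_size + hidden_size)
            self.b_h = np.zeros(hidden_size)
            # Hidden state carried from one time step to the next.
            # Resetting it to zero inside forward() would discard all
            # history and make the cell non-recurrent.
            self.h_prev = np.zeros((hidden_size,))

        def tanh(self, x):
            return np.tanh(x)

        def sigmoid(self, x):
            return 1 / (1 + np.exp(-x))

        def forward(self, x):
            h_prev = self.h_prev
            # Concatenate the current input with the previous hidden state
            concat_input = np.concatenate([x, h_prev], axis=0)

            # Update gate and reset gate
            z_t = self.sigmoid(np.dot(self.W_z, concat_input) + self.b_z)
            r_t = self.sigmoid(np.dot(self.W_r, concat_input) + self.b_r)

            # Candidate hidden state, computed from the reset-gated history
            concat_reset_input = np.concatenate([x, r_t * h_prev], axis=0)
            h_hat_t = self.tanh(np.dot(self.W_h, concat_reset_input) + self.b_h)

            # Final hidden state: blend the old state with the candidate
            h_t = (1 - z_t) * h_prev + z_t * h_hat_t

            # Remember the state for the next time step
            self.h_prev = h_t
            return h_t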

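    # Test data
    input_size = 3
    hidden_size = 2
    seq_len = 4
    x = np.random.randn(seq_len, input_size)
    gru = GRU(input_size, hidden_size)
    all_h = []
    for t in range(seq_len):
        h_t = gru.forward(x[t, :])
        all_h.append(h_t)
        print(h_t.shape)
    print(np.array(all_h).shape)

3.2 PyTorch

nn.GRUCell

    import torch
    import torch.nn as nn


    class GRUCell(nn.Module):
        def __init__(self, input_size, hidden_size):
            super(GRUCell, self).__init__()
            self.input_size = input_size
            self.hidden_size = hidden_size
            self.gru_cell = nn.GRUCell(input_size, hidden_size)

        def forward(self, x, h_prev=None):
            # nn.GRUCell starts from a zero hidden state when h_prev is None
            h_t = self.gru_cell(x, h_prev)
            return h_t


    # Test data
    input_size = 3
    hidden_size = 2
    seq_len = 4
    gru_model = GRUCell(input_size, hidden_size)
    x = torch.randn(seq_len, input_size)
    h_t = None
    for t in range(seq_len):
        # Feed the previous hidden state back in; without this every
        # step would start again from zeros
        h_t = gru_model(x[t], h_t)
        print(h_t)

nn.GRU

    import torch
    import torch.nn as nn


    class GRU(nn.Module):
        def __init__(self, input_size, hidden_size):
            super(GRU, self).__init__()
            self.input_size = input_size
            self.hidden_size = hidden_size
            self.gru = nn.GRU(input_size, hidden_size)

        def forward(self, x):
            # out holds the hidden state of every time step
            out, _ = self.gru(x)
            return out


    # Test data
    input_size = 3
    hidden_size = 2
    seq_len = 4
    batch_size = 5
    x = torch.randn(seq_len, batch_size, input_size)
    gru_model = GRU(input_size, hidden_size)
    out = gru_model(x)
    print(out)
    print(out.shape)

II. BiLSTM

1. Overview

The bidirectional long short-term memory network (BiLSTM) is an extension of LSTM that takes both past and future information in a sequence into account. BiLSTM uses two independent LSTM layers, one processing the input sequence forward and the other processing it in reverse, so that the output at each time step contains information from both before and after that step. This bidirectional structure captures context in the sequence more effectively and improves the model's grasp of semantics.

Forward pass: the input sequence is fed to the first LSTM layer in time order. The output of every time step is computed and kept.

Backward pass: the input sequence is fed to the second LSTM layer in reverse time order (the last element first). As in the forward pass, the output of every time step is computed and kept.

Merged output: at each time step, the outputs of the two LSTM layers are combined in some way (for example by concatenation or summation) to give the final output.

2. Application background of the BiLSTM model

Tag sets for named entity recognition

BMES tag set: there is more than one tag set for word segmentation. Taking Chinese word segmentation as an example, a Chinese character can be the beginning of a word (Begin), the middle of a word (Middle), the end of a word (End), or a single-character word (Single). These four cases cover every segmentation situation, which gives the BMES tag set. Tag sets of this kind are also very common in named entity recognition tasks.

Part-of-speech tagging: in a sequence labeling problem the word sequence is x and the part-of-speech sequence is y. Deciding the part of speech of the current word requires taking the parts of speech of the surrounding words into account. The best-known tag sets are the 863 tag set and the Peking University tag set.

3. Code implementation

Native code

    import numpy as np
    import torch


    class BiLSTM():
        def __init__(self, input_size, hidden_size, output_size):
            self.input_size = input_size
            self.hidden_size = hidden_size
            self.output_size = output_size
            # Forward direction
            self.lstm_forward = LSTM(input_size, hidden_size, output_size)
            # Backward direction
            self.lstm_backward = LSTM(input_size, hidden_size, output_size)

        def forward(self, x):
            # Forward LSTM
            output, _, _ = self.lstm_forward.forward(x)
            # Backward LSTM: np.flip(x, 0) reverses the time axis
            # (axis 1 would reverse the features instead)
            output_backward, _, _ = self.lstm_backward.forward(np.flip(x, 0))
            # Reverse the backward outputs again so that index t refers
            # to the same time step in both directions
            output_backward = output_backward[::-1]
            # Concatenate the hidden states of the two layers per time step
            combine_output = [np.concatenate((f, b), axis=0)
                              for f, b in zip(output, output_backward)]
            return combine_output


    class LSTM:
        def __init__(self, input_size, hidden_size, output_size):
            """
            :param input_size: word vector size
            :param hidden_size: hidden layer size
            :param output_size: number of output classes
            """
            self.input_size = input_size
            self.hidden_size = hidden_size
            self.output_size = output_size

            # Initialize weights and biases. W and U from the structure
            # diagram are concatenated, so the second dimension is
            # input_size + hidden_size
            self.w_f = np.random.rand(hidden_size, input_size + hidden_size)
            self.b_f = np.random.rand(hidden_size)

            self.w_i = np.random.rand(hidden_size, input_size + hidden_size)
            self.b_i = np.random.rand(hidden_size)

            self.w_c = np.random.rand(hidden_size, input_size + hidden_size)
            self.b_c = np.random.rand(hidden_size)

            self.w_o = np.random.rand(hidden_size, input_size + hidden_size)
            self.b_o = np.random.rand(hidden_size)

            # Output layer
            self.w_y = np.random.rand(output_size, hidden_size)
            self.b_y = np.random.rand(output_size)

        def tanh(self, x):
            return np.tanh(x)

        def sigmoid(self, x):
            return 1 / (1 + np.exp(-x))

        def forward(self, x):
            h_t = np.zeros((self.hidden_size,))  # initial hidden state
            c_t = np.zeros((self.hidden_size,))  # initial cell state

            h_states = []  # hidden state of every time step
            c_states = []  # cell state of every time step

            for t in range(x.shape[0]):
                x_t = x[t]  # input of the current time step
                # Concatenate x_t and h_t into one vector
                x_t = np.concatenate([x_t, h_t])

                # Forget gate
                f_t = self.sigmoid(np.dot(self.w_f, x_t) + self.b_f)

                # Input gate
                i_t = self.sigmoid(np.dot(self.w_i, x_t) + self.b_i)
                # Candidate cell state
                c_hat_t = self.tanh(np.dot(self.w_c, x_t) + self.b_c)

                # Update the cell state
                c_t = f_t * c_t + i_t * c_hat_t

                # Output gate
                o_t = self.sigmoid(np.dot(self.w_o, x_t) + self.b_o)
                # Update the hidden state
                h_t = o_t * self.tanh(c_t)

                # Keep the hidden and cell state of every time step
                h_states.append(h_t)
                c_states.append(c_t)

            # Output layer: classify from the hidden state of the last time step
            y_t = np.dot(self.w_y, h_t) + self.b_y
            # Convert to a tensor; dim=0 is the only dimension of this vector
            output = torch.softmax(torch.tensor(y_t), dim=0)

            return np.array(h_states), np.array(c_states), output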

    # Test data
    input_size = 3
    hidden_size = 8
    output_size = 5
    seq_len = 4
    x = np.random.randn(seq_len, input_size)
    bilstm = BiLSTM(input_size, hidden_size, output_size)
    outputs = bilstm.forward(x)
    print(outputs)
    print(np.array(outputs).shape)
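As a rough check of the earlier claim that a GRU usually has fewer parameters than an LSTM, the sketch below counts parameters in the concatenated-weight form used by the native code in this article, where each gate has one weight matrix of shape (hidden_size, input_size + hidden_size) and one bias vector. The function names are hypothetical helpers introduced only for this illustration, and the LSTM output layer (w_y, b_y) is left out of the count.

```python
def gate_params(input_size, hidden_size):
    """Parameters of one gate: a (hidden, input + hidden) weight plus a bias."""
    return hidden_size * (input_size + hidden_size) + hidden_size


def gru_params(input_size, hidden_size):
    # Three weight sets: update gate, reset gate, candidate hidden state
    return 3 * gate_params(input_size, hidden_size)


def lstm_params(input_size, hidden_size):
    # Four weight sets: forget gate, input gate, candidate cell state, output gate
    return 4 * gate_params(input_size, hidden_size)


if __name__ == "__main__":
    for input_size, hidden_size in [(3, 2), (3, 8), (128, 256)]:
        print(input_size, hidden_size,
              gru_params(input_size, hidden_size),
              lstm_params(input_size, hidden_size))
```

Under this counting the GRU always has three quarters of the recurrent parameters of an LSTM of the same size. PyTorch's nn.GRU and nn.LSTM keep separate input and hidden biases, so their absolute counts differ from these numbers, but the 3:4 ratio is the same.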

    import numpy as np

    # Create a list containing two 2-D arrays
    inputs = [np.array([[0.1], [0.2], [0.3]]), np.array([[0.4], [0.5], [0.6]])]

    # np.stack stacks the 2-D input arrays along a new axis,
    # producing a new 3-D array
    inputs_3d = np.stack(inputs)

    # Convert the 3-D array to a list
    list_from_3d_array = inputs_3d.tolist()
    print(list_from_3d_array)

PyTorch

    import torch
    import torch.nn as nn


    class BiLSTM(nn.Module):
        def __init__(self, input_size, hidden_size, output_size):
            super(BiLSTM, self).__init__()
            # Define the bidirectional LSTM
            self.lstm = nn.LSTM(input_size, hidden_size, bidirectional=True)
            # Output layer: a bidirectional LSTM concatenates both
            # directions, so the input size is hidden_size * 2
            self.linear = nn.Linear(hidden_size * 2, output_size)

        def forward(self, x):
            out, _ = self.lstm(x)
            linear_out = self.linear(out)
            return linear_out


    # Test data
    input_size = 3
    hidden_size = 8
    output_size = 5
    seq_len = 4
    batch_size = 6
    x = torch.randn(seq_len, batch_size, input_size)
    model = BiLSTM(input_size, hidden_size, output_size)
    outputs = model(x)
    print(outputs)
    print(outputs.shape)
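To make the BMES tag set from the application background concrete, the sketch below turns an already segmented sentence into one tag per character. Tags of this kind are the per-time-step targets that a sequence tagger such as the BiLSTM above could be trained to predict (with output_size equal to the number of tags). The function name bmes_tags and the example segmentation are assumptions made for illustration and do not come from the original article.

```python
def bmes_tags(words):
    """Map a list of segmented words to one B/M/E/S tag per character."""
    tags = []
    for word in words:
        if len(word) == 1:
            # A single-character word
            tags.append("S")
        elif len(word) > 1:
            # Begin, zero or more Middle tags, then End
            tags.extend(["B"] + ["M"] * (len(word) - 2) + ["E"])
        # Empty strings carry no characters and are skipped
    return tags


if __name__ == "__main__":
    words = ["我", "喜欢", "自然语言"]  # hypothetical segmentation
    chars = [ch for word in words for ch in word]
    for ch, tag in zip(chars, bmes_tags(words)):
        print(ch, tag)
```

The mapping is lossless: the original word boundaries can be recovered by cutting after every E or S tag, which is why four tags are enough to cover every segmentation.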