Deep Attention Matching Network
The Deep Attention Matching Network (DAM) is a response matching model for open-domain multi-turn dialogue: given the multi-turn dialogue history and a set of candidate responses, it scores each candidate and ranks them to select the most appropriate response.
Task Definition and Modeling
The input of the multi-turn response matching task is the multi-turn dialogue history plus a candidate response; the output is a matching score for that response, and the candidates are ranked by this score.
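The score-then-rank setup above can be sketched as follows. The `toy_score` function here is only a stand-in for the trained matching model, used to show the ranking step itself:

```python
# Rank candidate responses for one dialogue context by matching score.
# score_fn stands in for the trained matching model (hypothetical here).

def rank_candidates(context, candidates, score_fn):
    """Return candidates sorted by matching score, best first."""
    scored = [(score_fn(context, c), c) for c in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for _, c in scored]

# Toy score: token overlap between the whole history and the candidate.
def toy_score(context, candidate):
    ctx_tokens = set(" ".join(context).split())
    return len(ctx_tokens & set(candidate.split()))

history = ["how do I unzip a file", "which package do you mean"]
candidates = ["use the unzip package", "reboot the machine"]
print(rank_candidates(history, candidates, toy_score))
# → ['use the unzip package', 'reboot the machine']
```

In the real model, `score_fn` is the network's forward pass over padded token-ID tensors rather than a token-overlap heuristic.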
Model Overview
For the model details, refer to the paper: http://aclweb.org/anthology/P18-1103.
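The paper builds the network entirely from stacked attentive modules: self-attention within each utterance/response and cross-attention between them. A minimal NumPy sketch of the underlying scaled dot-product attention (illustrative only, not this repo's implementation):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(QK^T / sqrt(d)) V — the attentive
    module that DAM stacks for self- and cross-attention."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                     # (len_q, len_k)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)    # row-wise softmax
    return weights @ V

rng = np.random.default_rng(0)
utterance = rng.standard_normal((5, 8))   # 5 tokens, embedding dim 8
response = rng.standard_normal((4, 8))    # 4 tokens, embedding dim 8
# self-attention over the utterance, cross-attention against the response
self_att = scaled_dot_product_attention(utterance, utterance, utterance)
cross_att = scaled_dot_product_attention(utterance, response, response)
print(self_att.shape, cross_att.shape)   # (5, 8) (5, 8)
```

Stacking this block `stack_num` times (5 in the configuration below) yields the multi-grained representations that the paper matches with 3D convolutions.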
Data Format
The data used for training, prediction, and evaluation looks like the examples below. Each record has three columns separated by tabs ('\t'): the first column is the space-separated token IDs of the dialogue context, the second is the space-separated token IDs of the response, and the third is the label.
286 642 865 36 87 25 693 0
17 54 975 512 775 54 6 1
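A record in this format can be parsed with a few lines of Python. This is a sketch: `parse_line` is a hypothetical helper, and the context/response split in the sample string is assumed for illustration (the tabs are not visible in the examples above):

```python
def parse_line(line):
    """Parse one tab-separated record: context IDs, response IDs, label."""
    context, response, label = line.rstrip("\n").split("\t")
    return {
        "context": [int(t) for t in context.split()],
        "response": [int(t) for t in response.split()],
        "label": int(label),
    }

# Assumed split of the first sample row into its three tab fields.
sample = "286 642 865 36 87 25\t693\t0"
print(parse_line(sample))
# → {'context': [286, 642, 865, 36, 87, 25], 'response': [693], 'label': 0}
```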
Note: the project also ships a tokenization preprocessing script (under the preprocess directory), which is used as follows:
python tokenizer.py \
--test_data_dir ./test.txt.utf8 \
--batch_size 1 > test.txt.utf8.seg
Code Structure
main.py: the entry point of the project, wrapping both training and prediction
config.py: configuration of the model, including the model type and its hyperparameters
reader.py: data reading and vocabulary loading
evaluation.py: evaluation functions
run.sh: script to launch training and prediction
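Response selection on the Ubuntu corpus is conventionally evaluated with R_n@k recall metrics (e.g. R10@1): whether the true response lands in the top k of n scored candidates. A minimal sketch of the metric (the exact interface of evaluation.py may differ):

```python
def recall_at_k(scores, labels, k, n_candidates=10):
    """R_n@k: fraction of contexts whose true response (label 1) is
    ranked in the top k among n_candidates scored candidates."""
    hits, total = 0, 0
    for start in range(0, len(scores), n_candidates):
        group = list(zip(scores[start:start + n_candidates],
                         labels[start:start + n_candidates]))
        group.sort(key=lambda pair: pair[0], reverse=True)
        hits += any(label == 1 for _, label in group[:k])
        total += 1
    return hits / total

# Two contexts x 10 candidates each; the true response is ranked
# 1st in the first group and 3rd in the second group.
scores = [0.9] + [0.1] * 9 + [0.5, 0.4, 0.3] + [0.1] * 7
labels = [1] + [0] * 9 + [0, 0, 1] + [0] * 7
print(recall_at_k(scores, labels, k=1))  # → 0.5
print(recall_at_k(scores, labels, k=5))  # → 1.0
```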
Files
deep_attention_matching/: main executable files of the Deep Attention Matching model
deep_attention_matching/data: the Ubuntu dialogue dataset
deep_attention_matching/model_files: trained model checkpoints
preprocess: data preprocessing scripts
This example runs on GPU. To run on CPU instead, simply delete the --use_cuda line from the commands below.
# This example uses the Ubuntu dialogue dataset
!cd data/data11447/ && unzip -qo ubuntu.zip
# Model training; change data_path to switch datasets
!python deep_attention_matching/main.py \
--do_train True \
--use_cuda \
--data_path data/data11447/ubuntu/data_small.pkl \
--save_path deep_attention_matching/model_files/ubuntu \
--vocab_size 434512 \
--_EOS_ 28270 \
--batch_size 32
----------- Configuration Arguments -----------
_EOS_: 28270
batch_size: 32
channel1_num: 32
channel2_num: 16
data_path: data/data11447/ubuntu/data_small.pkl
do_infer: False
do_train: True
emb_size: 200
ext_eval: False
learning_rate: 0.001
max_turn_len: 50
max_turn_num: 9
model_path: None
num_scan_data: 2
save_path: deep_attention_matching/model_files/ubuntu
stack_num: 5
use_cuda: True
vocab_size: 434512
word_emb_init: None
------------------------------------------------
begin memory optimization ... 2019-09-05 15:58:35
end memory optimization ... 2019-09-05 15:58:35
device count 1
theoretical memory usage: (8378.70401058197, 8777.689915847779, 'MB')
W0905 15:58:37.603806 1186 device_context.cc:259] Please NOTE: device: 0, CUDA Capability: 70, Driver API Version: 9.2, Runtime API Version: 9.0
W0905 15:58:37.611320 1186 device_context.cc:267] device: 0, cuDNN Version: 7.3.
WARNING:root: You can try our memory optimize feature to save your memory usage:
    # create a build_strategy variable to set memory optimize option
    build_strategy = compiler.BuildStrategy()
    build_strategy.enable_inplace = True
    build_strategy.memory_optimize = True
    # pass the build_strategy to with_data_parallel API
    compiled_prog = compiler.CompiledProgram(main).with_data_parallel(
        loss_name=loss.name, build_strategy=build_strategy)
!!! Memory optimize is our experimental feature !!!
    some variables may be removed/reused internal to save memory usage,
    in order to fetch the right value of the fetch_list, please set the persistable property to true for each variable in fetch_list
    # Sample
    conv1 = fluid.layers.conv2d(data, 4, 5, 1, act=None)
    # if you need to fetch conv1, then:
    conv1.persistable = True
I0905 15:58:37.884307 1186 parallel_executor.cc:329] The number of CUDAPlace, which is used in ParallelExecutor, is 1. And the Program will be copied 1 copies
I0905 15:58:38.161104 1186 build_strategy.cc:340] SeqOnlyAllReduceOps:0, num_trainers:1
share_vars_from is set, scope is ignored.
I0905 15:58:38.416013 1186 parallel_executor.cc:329] The number of CUDAPlace, which is used in ParallelExecutor, is 1. And the Program will be copied 1 copies
I0905 15:58:38.489773 1186 build_strategy.cc:340] SeqOnlyAllReduceOps:0, num_trainers:1
start loading data ...
finish loading data ...
begin model training ... 2019-09-05 15:58:39
[1676 9116 5609 ... 1722 2436 5949]
processed: [0.00961538461538] ave loss: [0.7781140208244324]
processed: [0.0192307692308] ave loss: [0.7981151739756266]
processed: [0.0288461538462] ave loss: [0.6933611432711283]
processed: [0.0384615384615] ave loss: [0.7092911005020142]
# Model inference; predictions are saved under --save_path, and one example prediction is printed below
# turns is the multi-turn dialogue history, response is the candidate reply, and scores is the matching score
!python deep_attention_matching/main.py \
--do_infer True \
--use_cuda \
--data_path ./data/data11447/ubuntu/data_small.pkl \
--save_path deep_attention_matching/infer_result \
--model_path deep_attention_matching/model_files/ubuntu/ \
--vocab_size 434512 \
--_EOS_ 28270 \
--batch_size 1
----------- Configuration Arguments -----------
_EOS_: 28270
batch_size: 1
channel1_num: 32
channel2_num: 16
data_path: ./data/data11447/ubuntu/data_small.pkl
do_infer: True
do_train: False
emb_size: 200
ext_eval: False
learning_rate: 0.001
max_turn_len: 50
max_turn_num: 9
model_path: deep_attention_matching/model_files/ubuntu/
num_scan_data: 2
save_path: deep_attention_matching/infer_result
stack_num: 5
use_cuda: True
vocab_size: 434512
word_emb_init: None
------------------------------------------------
W0905 15:46:10.321343 919 device_context.cc:259] Please NOTE: device: 0, CUDA Capability: 70, Driver API Version: 9.2, Runtime API Version: 9.0
W0905 15:46:10.325507 919 device_context.cc:267] device: 0, cuDNN Version: 7.3.
start loading data ...
finish loading data ...
test batch num: 1000
begin inference ... 2019-09-05 15:46:11
(in the arrays below, each turn is zero-padded to max_turn_len = 50 token IDs; the padding runs are elided as "0, ..., 0")
('turns:', array([[[393704, 0, ..., 0],
                   [250191, 34296, 350284, 30835, 59150, 74395, 0, ..., 0],
                   [180037, 28847, 88281, 115692, 413324, 279504, 354176, 20481, 418397, 418397, 177048, 197682, 115692, 373516, 192382, 285320, 20484, 20494, 229901, 9751, 20494, 11317, 20484, 347085, 0, ..., 0],
                   [ 97341, 291041, 14781, 414881, 126529, 174798, 1828, 324795, 324507, 227764, 20484, 54259, 7198, 296758, 259553, 354176, 123155, 20484, 149834, 343709, 238, 20484, 106788, 0, ..., 0],
                   [131277, 326026, 146729, 170184, 180037, 418453, 20484, 20494, 229901, 0, ..., 0],
                   [131277, 326026, 207572, 307055, 284678, 285320, 180037, 404776, 364101, 0, ..., 0],
                   [395666, 75844, 233777, 195724, 0, ..., 0],
                   [ 81947, 427943, 257613, 20484, 373516, 1598, 395666, 233777, 195724, 0, ..., 0],
                   [262553, 368624, 15331, 115692, 107043, 343709, 27044, 307801, 367146, 309360, 0, ..., 0]]]))
('response:', array([[[131277], [104719], [285320], [191048], [393704], [0], ..., [0]]]))
('scores:', array([[-0.2420864]], dtype=float32))
finish test 2019-09-05 15:46:36
Click the link to try this project hands-on on AI Studio: https://aistudio.baidu.com/aistudio/projectdetail/122287
Source: oschina
Link: https://my.oschina.net/u/4067628/blog/4255292