2019 Interspeech
1. Improved End-to-End Speech Emotion Recognition Using Self Attention Mechanism and Multitask Learning
- 东京大学
- 端到端多任务学习with self attention,辅助任务是gender。
首先从语谱图提取特征speech spectrogram,而不是用手工特征。然后CNN-BLSTM E2E网络。随后用self attention mechanism聚焦到情感 salient periods。最后考虑到emotion and gender classification tasks之间的相互特征,结合了性别分类作为附加task,与主要任务emotion classification share有用的信息。 - 摘要从人机交互应用说明SER has attracted great attention,更有画面感。介绍,分别叙述了特征、语谱图的优越性 、HMM GMM SVM等traditional machine learning approaches, CNN RNN traditional machine learning approaches。
- multi-headed self attention
- 提取语谱图:长度归一化到7.5s,不足的补零,长的cut。Hanning windows 800。sampling rate 16000Hz.
短时傅里叶变换 - 和 是1
实验
IEMOCAP combine EXCITED and HAPPY into HAPPY 四类 一共5531samples。
实验结果对比有5-fold cross-validation(2018),也有leave-one-session-out。
2. Self-attention for Speech Emotion Recognition
- “Attention is all you need”2017 Available
based on an encoder-decoder structure that 没有使用任何 recurrence, but instead uses weighted
correlations between the elements of the input sequence
Transformer:把input sequence映射成a query, a key and a value
介绍了各种attention。 - 提出了 a global windowing system that works works on top of the local windows.
- classification and regression.
实验
5 fold cross validation.
因为happy少,换成了excited,这样balance。不知道这老哥比较的对不对,[2]也是excited 5折吗?
3. Deep Learning of Segment-Level Feature Representation with Multiple Instance Learning for Utterance-Level Speech Emotion Recognition
- Segment-Level
首先给每一个segment分类,utterance-level 的分类结果是segment-level的 aggregation
of the segment-level decisions
发现了(1) the aggregation of segment-level decisions provides richer information than the statistics over the low-level descriptors (LLDs) across the whole utterance;
(2)automatic feature learning outperforms manual features
其中SegMLP的输入是IS09(manually designed perceptual features),SegCNN的输入是 log Mel filterbanks.再分别接ELM SVM RF 一共6组实验
automatic feature learning outperforms manually designed perceptual features
aggregation的方法,把f matrix送入到三种分类网络 - multiple instance learning (MIL)
实验
两个库 CASIA IEMOCAP
来源:CSDN
作者:wangdapang_2
链接:https://blog.csdn.net/qq_38221026/article/details/103887135