动态时间规整( DTW )
一、DTW( Dynamic Time Warping)
• 按照距离最近的原则,构建两个序列元素之间的对应的 关系,评估两个序列的相似性。
• 要求
(1)单向对应,不能回头
(2)一一对应,不能有空
(3)对应之后,距离最近
二、算法实现
三、代码实现
# from dtw import dtw
import numpy as np
def dis_abs(x, y):
return abs(x-y)[0]
def estimate_twf(A,B,dis_func=dis_abs):
N_A = len(A)
N_B = len(B)
D = np.zeros([N_A,N_B])
D[0,0] = dis_func(A[0],B[0])
# 左边一列
for i in range(1,N_A):
D[i,0] = D[i-1,0]+dis_func(A[i],B[0])
# 下边一行
for j in range(1,N_B):
D[0,j] = D[0,j-1]+dis_func(A[0],B[j])
# 中间部分
for i in range(1,N_A):
for j in range(1,N_B):
D[i,j] = dis_func(A[i],B[j])+min(D[i-1,j],D[i,j-1],D[i-1,j-1])
# 路径回溯
i = N_A-1
j = N_B-1
count =0
d = np.zeros(max(N_A,N_B)*3)
path = []
while True:
if i>0 and j>0:
path.append((i,j))
m = min(D[i-1, j],D[i, j-1],D[i-1,j-1])
if m == D[i-1,j-1]:
d[count] = D[i,j] - D[i-1,j-1]
i = i-1
j = j-1
count = count+1
elif m == D[i,j-1]:
d[count] = D[i,j] - D[i,j-1]
j = j-1
count = count+1
elif m == D[i-1, j]:
d[count] = D[i,j] - D[i-1,j]
i = i-1
count = count+1
elif i == 0 and j == 0:
path.append((i,j))
d[count] = D[i,j]
count = count+1
break
elif i == 0:
path.append((i,j))
d[count] = D[i,j] - D[i,j-1]
j = j-1
count = count+1
elif j == 0:
path.append((i,j))
d[count] = D[i,j] - D[i-1,j]
i = i-1
count = count+1
mean = np.sum(d) / count
return mean, path[::-1],D
if __name__=="__main__":
a = np.array([1,3,4,9,8,2,1,5,7,3])
b = np.array([1,6,2,3,0,9,4,1,6,3])
a = a[:,np.newaxis]
b = b[:,np.newaxis]
dis,path,D = estimate_twf(a,b,dis_func=dis_abs)
print(dis)
print(path)
print(D)
# print(a.shape)
# print(dtw.__file__)
# print("-------")
# f = dis_abs
# out = f(a,b)
# print("-------")
# print(out)
四、算法应用
• 计算两个序列之间的相似性 (取dis) 动作识别:
A : 录制动作时传感器数据 B:测试动作时传感器数据
• 获取匹配特征对 (取path) Voice conversion
A :说话人A的特征 B: 说话人B的特征
• 选择不同的dis 语音识别/英文发音质量检测
A :语音特征 B:文本序列标签 dis = 概率分布