Parallel linear RankSVM algorithm based on multi-core systems

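Linear RankSVM has been studied fairly thoroughly, but training a large-scale linear RankSVM still takes unacceptably long. An analysis of the current state-of-the-art algorithm, Tree-TRON, shows that training a linear RankSVM model with the trust region Newton method (TRON) involves a large number of Hessian-vector product computations, each of which in turn requires computing many auxiliary variables and matrix operations. To accelerate the computations related to Hessian-vector products, an efficient parallel algorithm for multi-core systems (named PRankSVM) is proposed to improve large-scale linear...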

Bibliographic Details
Published in	Application Research of Computers (计算机应用研究) Vol. 34; no. 1; pp. 46-51
Main Author	Nie Hui (聂慧); Peng Jiao (彭娇); Jin Jing (金晶); Li Kangshun (李康顺)
Format	Journal Article
Language	Chinese
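Published	Department of Computer Science, Guangdong University of Science and Technology, Dongguan, Guangdong 523000; School of Data and Computer Science, Sun Yat-sen University, Guangzhou 510006; College of Mathematics and Informatics / College of Software, South China Agricultural University, Guangzhou 510006; 2017
Subjects	multi-core system; parallel computing; learning to rank; linear RankSVM model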
ISSN	1001-3695
DOI	10.3969/j.issn.1001-3695.2017.01.009

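Summary:	Linear RankSVM has been studied fairly thoroughly, but training a large-scale linear RankSVM still takes unacceptably long. An analysis of the current state-of-the-art algorithm, Tree-TRON, shows that training a linear RankSVM model with the trust region Newton method (TRON) involves a large number of Hessian-vector product computations, each of which in turn requires computing many auxiliary variables and matrix operations. To accelerate the computations related to Hessian-vector products, an efficient parallel algorithm for multi-core systems (named PRankSVM) is proposed to speed up the training of large-scale linear RankSVM. PRankSVM has two main features: the training data are partitioned by query into separate subproblems; and, on a multi-core system, the cores are used to accelerate the computation of the auxiliary variables and the related matrices. Experiments show that, compared with existing algorithms such as Tree-TRON, PRankSVM effectively improves training speed while maintaining prediction accuracy.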
Bibliography:	Many effective linear RankSVM algorithms have been studied extensively. However, training a large-scale linear RankSVM with any of them still takes an extremely long time. An analysis of the existing state-of-the-art algorithm Tree-TRON shows that, when the trust region Newton method (TRON) is used to train a linear RankSVM, the massive number of Hessian-vector products and the computation of the auxiliary variables significantly affect the training speed. To accelerate these computations efficiently, this paper proposes an efficient parallel algorithm (named PRankSVM) on multi-core systems. Two important issues are handled in the design of PRankSVM on multi-core systems. First, the training set is divided into several subsets according to the queries. Second, the computational power of the multi-core system is used to speed up the Hessian-vector products and the computation of the auxiliary variables. The experimental results show ...
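
The per-query decomposition described in the summary can be illustrated with a minimal sketch. It is not the paper's PRankSVM implementation: it assumes the common L2-loss linear RankSVM objective f(w) = 1/2 w^T w + C * sum over preference pairs (i, j) of max(0, 1 - w^T (x_i - x_j))^2, for which the generalized Hessian-vector product is Hv = v + 2C * sum over margin-violating pairs of ((x_i - x_j)^T v) (x_i - x_j). Pairs are enumerated naively rather than with the tree-based bookkeeping of Tree-TRON, and the names `query_term` and `hessian_vector` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import partial


def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


def query_term(query, w, v):
    """Sum of ((x_i - x_j)^T v) (x_i - x_j) over the active pairs of one query.

    query is (X, y): a list of feature vectors and their relevance labels.
    A pair (i, j) is active when y_i > y_j and 1 - w^T (x_i - x_j) > 0.
    """
    X, y = query
    out = [0.0] * len(w)
    for i in range(len(y)):
        for j in range(len(y)):
            if y[i] <= y[j]:
                continue  # only pairs where i should rank above j
            d = [a - b for a, b in zip(X[i], X[j])]
            if 1.0 - dot(w, d) > 0.0:  # pair still violates the margin
                s = dot(d, v)
                out = [o + s * dk for o, dk in zip(out, d)]
    return out


def hessian_vector(queries, w, v, C=1.0, workers=4):
    """Hv = v + 2C * (sum of per-query terms), one task per query."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = list(pool.map(partial(query_term, w=w, v=v), queries))
    hv = list(v)
    for part in parts:  # reduce the independent per-query results
        hv = [h + 2.0 * C * p for h, p in zip(hv, part)]
    return hv
```

Because preference pairs never cross queries, each query's term is independent and the results only need to be summed at the end. The thread pool here shows the task structure only; in pure Python the interpreter lock prevents a real speed-up, so an actual multi-core implementation would need native threads or separate processes.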