About
SubSift is a collection of tools developed in Bristol for matching academic researchers with text documents. The first use case was matching submitted papers to potential reviewers, initially for conferences (KDD’09, ECML-PKDD’12) and later for journals (Machine Learning). It works by converting text documents to a bag-of-words (BoW) representation weighted with TF-IDF and calculating pairwise cosine similarities. Each researcher is represented by a BoW built from the publication titles listed on their DBLP page.
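
The matching idea can be illustrated with a minimal sketch in plain Python. This is not SubSift's own code or API. The function names (tokenize, tfidf_vectors, cosine, match), the tokenization rule, the particular TF-IDF variant (relative term frequency times log(N/df)), and the sample titles are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Assumption: lowercase and split on anything that is not a letter or digit.
    return [tok for tok in re.split(r"[^a-z0-9]+", text.lower()) if tok]


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict: term -> weight) per document."""
    tokenized = [tokenize(doc) for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for toks in tokenized for term in set(toks))
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vectors.append({
            term: (count / len(toks)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity of two sparse vectors; 0.0 if either has zero norm."""
    dot = sum(weight * v[term] for term, weight in u.items() if term in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def match(papers, reviewers):
    """Score every paper against every reviewer; both are dicts of name -> text."""
    names = list(papers) + list(reviewers)
    texts = list(papers.values()) + list(reviewers.values())
    # IDF is computed over the combined collection (an assumption of this sketch).
    vectors = dict(zip(names, tfidf_vectors(texts)))
    return {
        p: {r: cosine(vectors[p], vectors[r]) for r in reviewers}
        for p in papers
    }


# Made-up titles, for illustration only.
reviewers = {
    "reviewer_a": "Learning decision trees from data. Pruning decision trees.",
    "reviewer_b": "Kernel methods for support vector machines. Large margin classifiers.",
}
papers = {
    "paper_1": "Fast induction of decision trees",
    "paper_2": "Support vector machines with new kernel functions",
}

scores = match(papers, reviewers)
for paper, row in scores.items():
    best = max(row, key=row.get)
    print(paper, "->", best)
```

In this sketch a reviewer's text stands in for the concatenated publication titles from a DBLP page. Terms that occur in every document receive an IDF of zero, so only distinguishing vocabulary contributes to the similarity score.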