Unsupervised automatic tagging algorithms?

挽巷 2021-01-30 00:46

I want to build a web application that lets users upload documents, videos, images, and music, and then gives them the ability to search them. Thin

5 answers
  •  小蘑菇 (original poster)
    2021-01-30 01:00

    These guys propose an alternative to LDA.

    Automatic Tag Recommendation Algorithms for Social Recommender Systems http://research.microsoft.com/pubs/79896/tagging.pdf

    I haven't read through the whole paper, but they have two algorithms:

    1. Supervised learning version. This isn't that bad. You can use Wikipedia to train the algorithm.
    2. "Prototype" version. I haven't had a chance to go through this, but this is what they recommend.

    UPDATE: I've researched this some more and found another approach. It's a two-stage approach that's very simple to understand and implement. While too slow for hundreds of thousands of documents, it (probably) performs well enough for thousands of docs (so it's perfect for tagging a single user's documents). I'm going to try this approach and will report back on performance/usability.

    In the meantime, here's the approach:

    1. Use TextRank as per http://qr.ae/36RAP to generate a tag list for a single document, independently of all other documents.
    2. Use the algorithm from "Using Machine Learning to Support Continuous Ontology Development" (https://www.researchgate.net/publication/221630712_Using_Machine_Learning_to_Support_Continuous_Ontology_Development) to integrate the tag list (from step 1) into the existing tag list.
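
    To make step 1 concrete, here is a rough TextRank-style sketch in plain Python. It is not the code from the linked post: the function name `textrank_tags`, the tiny stopword list, and the default window size and damping factor are all my own assumptions, and the stopword filter stands in for the part-of-speech filtering that TextRank normally uses. Step 2 is not covered here.

```python
import re
from collections import defaultdict

# Tiny illustrative stopword list (an assumption; TextRank proper filters
# candidates by part of speech rather than by a stopword list).
STOPWORDS = frozenset(
    "a an and are as at be by for from in is it of on or that the this to with".split()
)


def textrank_tags(text, top_k=5, window=2, damping=0.85, iterations=50):
    """Return up to top_k candidate tags for one document, best first."""
    words = [
        w for w in re.findall(r"[a-z]+", text.lower())
        if w not in STOPWORDS and len(w) > 2
    ]

    # Undirected co-occurrence graph: link each word to the distinct words
    # that follow it within `window` positions.
    neighbors = defaultdict(set)
    for i, word in enumerate(words):
        for other in words[i + 1 : i + 1 + window]:
            if other != word:
                neighbors[word].add(other)
                neighbors[other].add(word)
    if not neighbors:
        return []

    # PageRank-style iteration: each word shares its score equally
    # among its neighbours on every round.
    scores = {w: 1.0 for w in neighbors}
    for _ in range(iterations):
        scores = {
            w: (1 - damping)
            + damping * sum(scores[n] / len(neighbors[n]) for n in neighbors[w])
            for w in neighbors
        }

    # Highest score first; ties broken alphabetically for stable output.
    ranked = sorted(scores, key=lambda w: (-scores[w], w))
    return ranked[:top_k]
```

    Calling `textrank_tags(document_text, top_k=10)` on each uploaded document would give the per-document tag list that step 2 then merges into the existing tags. A fixed number of iterations is used instead of a convergence check to keep the sketch short.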
