Finding subcategories of a Wikipedia category using the category and categorylinks tables

轻奢々 2020-12-18 17:00

I downloaded the sql.gz dumps of the category and categorylinks tables from MediaWiki and generated the required tables:

category and categorylinks

Manual for the ta

1 answer
  • 2020-12-18 17:32

    Categories alone have no hierarchy. It’s the category pages that make the subcategorization work. So you will also have to get the page_id from the page table to be able to resolve this relation.

    It essentially works like this:

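    1. A category’s cat_title is also the title of its category page.
    2. Find that title as page_title in the page table (category pages are in namespace 14) and get the page_id.
    3. Use the page_id to find the category links whose cl_from matches it.
    4. Get the parent category title from cl_to of those links.
    5. Repeat from step 2 with each parent title.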
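
The steps above can be sketched as follows. This is only a minimal illustration, not a definitive implementation: it uses Python's sqlite3 with a tiny in-memory stand-in for the page and categorylinks tables, and it assumes the classic schema in which categorylinks has a cl_to title column and a cl_type column. The toy rows and the helper names (parent_categories, subcategories, ancestors) are made up for the example. In a real dump, titles use underscores instead of spaces and may be stored as binary strings.

```python
import sqlite3

# Toy stand-in for the imported dump (assumption: only the columns used here).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE page (page_id INTEGER PRIMARY KEY,
                   page_namespace INTEGER, page_title TEXT);
CREATE TABLE categorylinks (cl_from INTEGER, cl_to TEXT, cl_type TEXT);
""")
conn.executemany("INSERT INTO page VALUES (?, ?, ?)", [
    (1, 14, "Science"),   # namespace 14 = category pages
    (2, 14, "Physics"),
    (3, 14, "Optics"),
    (4, 0, "Lens"),       # namespace 0 = ordinary article
])
conn.executemany("INSERT INTO categorylinks VALUES (?, ?, ?)", [
    (2, "Science", "subcat"),  # Physics is a subcategory of Science
    (3, "Physics", "subcat"),  # Optics is a subcategory of Physics
    (4, "Optics", "page"),     # the article Lens is in Optics
])

def parent_categories(conn, cat_title):
    """Steps 2-4: title -> page_id -> cl_from -> parent titles in cl_to."""
    rows = conn.execute(
        """SELECT cl.cl_to
           FROM page AS p
           JOIN categorylinks AS cl ON cl.cl_from = p.page_id
           WHERE p.page_namespace = 14 AND p.page_title = ?
           ORDER BY cl.cl_to""",
        (cat_title,),
    ).fetchall()
    return [r[0] for r in rows]

def subcategories(conn, cat_title):
    """The reverse lookup: category pages whose links point at cat_title."""
    rows = conn.execute(
        """SELECT p.page_title
           FROM categorylinks AS cl
           JOIN page AS p ON p.page_id = cl.cl_from
           WHERE cl.cl_to = ? AND cl.cl_type = 'subcat'
             AND p.page_namespace = 14
           ORDER BY p.page_title""",
        (cat_title,),
    ).fetchall()
    return [r[0] for r in rows]

def ancestors(conn, cat_title):
    """Step 5: repeat the parent lookup, guarding against category cycles."""
    found, visited, stack = [], set(), [cat_title]
    while stack:
        for parent in parent_categories(conn, stack.pop()):
            if parent not in visited:
                visited.add(parent)
                found.append(parent)
                stack.append(parent)
    return found

print(subcategories(conn, "Science"))
print(ancestors(conn, "Optics"))
```

Because the original question asks for subcategories, the subcategories helper runs the same join in the opposite direction: it starts from cl_to and returns the titles of the linking category pages. The visited set is there because real category graphs can contain loops.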