SPARQL: limit the results for each value of a variable

醉话见心 2021-01-24 07:31

This is the minimum data required to reproduce the problem

@prefix : 
@prefix rdfs: 
@         


        
1 answer
  • 2021-01-24 08:23

    First, some minimal data: three artists, with some items for each one. I always stress minimal data on Stack Overflow because it's important for isolating the problem. In this case, you've still provided a relatively large query and a lot more data than we need. Since we know the problem is how to group artists that are each related to a number of items, all the data needs is some artists that are each related to a number of items. Then we can retrieve them easily, and group them easily.

    @prefix : <urn:ex:> .
    
    :artist1 :p :a1, :a2, :a3, :a4 .
    :artist2 :p :b2, :b2, :b3, :b4, :b5 .
    :artist3 :p :c2 .
    

    Now you can select artists and their items, and determine an index for each item. This method counts, for each item, how many of the same artist's items are less than or equal to it when compared as strings. There's always at least one such item (the item itself), so the counts are essentially a 1-based index.

    prefix : <urn:ex:>
    
    select ?artist ?item (count(?item_) as ?pos){
      ?artist :p ?item_, ?item .
      filter (str(?item_) <= str(?item))
    }
    group by ?artist ?item
    
    -------------------------
    | artist   | item | pos |
    =========================
    | :artist1 | :a1  | 1   |
    | :artist1 | :a2  | 2   |
    | :artist1 | :a3  | 3   |
    | :artist1 | :a4  | 4   |
    | :artist2 | :b2  | 1   |
    | :artist2 | :b3  | 2   |
    | :artist2 | :b4  | 3   |
    | :artist2 | :b5  | 4   |
    | :artist3 | :c2  | 1   |
    -------------------------
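
    To see why the count behaves like a rank, here is a rough sketch of the same self-join in plain Python. It is not a SPARQL engine; the names ITEMS and positions are made up for illustration, and item names are compared as plain strings in place of str(?item_) <= str(?item).

```python
from collections import defaultdict

# Same toy data as the Turtle above: artist -> set of related items.
# (:b2 is listed twice in the Turtle, but an RDF graph is a set of
# triples, so it only counts once; a Python set models that.)
ITEMS = {
    "artist1": {"a1", "a2", "a3", "a4"},
    "artist2": {"b2", "b3", "b4", "b5"},
    "artist3": {"c2"},
}

def positions(items_by_artist):
    """Return {(artist, item): pos}, where pos counts the artist's items <= item."""
    pos = defaultdict(int)
    for artist, items in items_by_artist.items():
        for item in items:            # plays the role of ?item
            for other in items:       # plays the role of ?item_
                if other <= item:     # filter (str(?item_) <= str(?item))
                    pos[(artist, item)] += 1  # count(?item_) per (artist, item) group
    return dict(pos)

for (artist, item), pos in sorted(positions(ITEMS).items()):
    print(artist, item, pos)
```

    Each item pairs with itself, so every count is at least 1, and the smallest item of each artist gets exactly 1.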
    

    Now you can use a having clause to filter on the position, so that you get at most two items per artist:

    prefix : <urn:ex:>
    
    select ?artist ?item {
      ?artist :p ?item_, ?item .
      filter (str(?item_) <= str(?item))
    }
    group by ?artist ?item
    having (count(?item_) < 3)
    
    -------------------
    | artist   | item |
    ===================
    | :artist1 | :a1  |
    | :artist1 | :a2  |
    | :artist2 | :b2  |
    | :artist2 | :b3  |
    | :artist3 | :c2  |
    -------------------
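
    The having filter can be mirrored the same way. The sketch below is plain Python rather than SPARQL, and the names ITEMS and at_most_n_per_group are hypothetical; n = 2 corresponds to having (count(?item_) < 3).

```python
# Same toy data as the Turtle above: artist -> set of related items.
ITEMS = {
    "artist1": {"a1", "a2", "a3", "a4"},
    "artist2": {"b2", "b3", "b4", "b5"},
    "artist3": {"c2"},
}

def at_most_n_per_group(items_by_group, n):
    """Keep, for each group, the items whose 1-based rank is <= n."""
    kept = {}
    for group, items in items_by_group.items():
        kept[group] = sorted(
            item
            for item in items
            # rank = number of the group's items that compare <= this one
            if sum(1 for other in items if other <= item) <= n
        )
    return kept

for artist, items in sorted(at_most_n_per_group(ITEMS, 2).items()):
    print(artist, items)
```

    Note that the rank comes from string comparison, so the ordering is lexicographic: a hypothetical item named a10 would rank before a2.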
    

    References

    Doing "n per each x" queries in SPARQL is kind of a challenge, and there's no great solution for it yet. Some related reading that might help (be sure to check the comments on these questions and answers, too):

    • SPARQL using subquery with limit (subqueries with limits can sometimes be helpful)
    • How to select first N row of each group (canonical question, in my opinion, but has no answer, since there's no general answer)
    • Find the two nearest neighbors of points (recent question with a "hack" answer)