WildcardQuery error in Solr

花落未央 2021-02-20 16:47

I use Solr to search for documents. When I try to search with the query "id:*", I get a query parser exception saying that the query cannot be parsed.

7 answers
  • 2021-02-20 17:12

    Lucene doesn't allow you to start WildcardQueries with an asterisk by default. Without a fixed prefix to narrow the search, such a query has to examine every term in the field, so it is very expensive and will be very slow on large indexes.

    If you're using the Lucene QueryParser, call setAllowLeadingWildcard(true) on it to enable it.

    If you want all of the documents with a certain field set, you are much better off querying or walking the index programmatically than using QueryParser. You should really only use QueryParser to parse user input.
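
To illustrate why a leading wildcard costs so much more than a trailing one, here is a rough sketch that uses a sorted `TreeSet` as a stand-in for a field's term dictionary. This is only an analogy, not Lucene's actual data structure; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

// Hypothetical stand-in for a sorted term dictionary; not a Lucene class.
final class TermDictionarySketch {

    // Trailing wildcard (prefix*): seek to the first term >= prefix,
    // then stop at the first term that no longer starts with it.
    static List<String> prefixMatches(NavigableSet<String> terms, String prefix) {
        List<String> hits = new ArrayList<>();
        for (String term : terms.tailSet(prefix, true)) {
            if (!term.startsWith(prefix)) {
                break; // sorted order: no later term can match
            }
            hits.add(term);
        }
        return hits;
    }

    // Leading wildcard (*suffix): sort order does not help,
    // so every term in the dictionary has to be tested.
    static List<String> suffixMatches(NavigableSet<String> terms, String suffix) {
        List<String> hits = new ArrayList<>();
        for (String term : terms) {
            if (term.endsWith(suffix)) {
                hits.add(term);
            }
        }
        return hits;
    }

    // Convenience helper for building a small sample dictionary.
    static NavigableSet<String> sample(String... terms) {
        NavigableSet<String> set = new TreeSet<>();
        for (String t : terms) {
            set.add(t);
        }
        return set;
    }
}
```

The prefix lookup only touches the matching terms plus one more, while the suffix lookup walks the whole set. That full walk is the cost the default setting is guarding against.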

  • 2021-02-20 17:14

    Actually, I have been using a workaround for this. I prepend a character to the id, e.g. A1, A2, etc.

    With such values in the field, it is possible to search using the query id:A*

    But I would love to know whether a true solution exists.

  • 2021-02-20 17:15

    If you are just trying to get all documents, Solr does support the *:* query. It's the only time I know of that Solr will let you begin a query with an *. I'm sure you've probably seen this as the default query in the Solr admin page.

    If you are trying to do a more specific query with an * as the first character, say id:*456, then one of the best ways I've seen is to index that field twice: once normally (field name: id), and once with all the characters reversed (field name: reverse_id). Then you can get the effect of the query id:*456 by sending the query reverse_id:654* instead, which turns the leading wildcard into a trailing one. Hope that makes sense.

    You can also search the Solr user group mailing list at http://www.mail-archive.com/solr-user@lucene.apache.org/ where questions like this come up quite often.
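
The reversed-field approach described above can be sketched in plain Java. This is a minimal illustration, not Solr code: the class and method names are hypothetical, the field name `reverse_id` is taken from the answer, and the sketch assumes the suffix contains no other wildcards or query-syntax characters that would need escaping.

```java
// Hypothetical helper for the "index the field twice" approach; not part of Solr.
final class ReversedIdQuery {

    // Value to store in the reverse_id field at index time, e.g. "A123456" -> "654321A".
    static String reverse(String value) {
        return new StringBuilder(value).reverse().toString();
    }

    // Rewrites a leading-wildcard term such as "*456" into a
    // trailing-wildcard query on the reversed field: "reverse_id:654*".
    static String toReversedFieldQuery(String leadingWildcardTerm) {
        if (leadingWildcardTerm.length() < 2 || !leadingWildcardTerm.startsWith("*")) {
            throw new IllegalArgumentException("expected a term of the form *suffix");
        }
        String suffix = leadingWildcardTerm.substring(1);
        // Assumption: suffix holds only literal characters (no escaping is done here).
        return "reverse_id:" + reverse(suffix) + "*";
    }
}
```

At index time each document would store both the id and `ReversedIdQuery.reverse(id)`. At query time, `ReversedIdQuery.toReversedFieldQuery("*456")` produces a query string that only needs a prefix match on the reversed field.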

  • 2021-02-20 17:20
    id:[a* TO z*] id:[0* TO 9*] etc.
    

    I just did this in lukeall on my index and it worked, therefore it should work in Solr which uses the standard query parser. I don't actually use Solr.

    In base Lucene there's a good reason why you'd never need to query for every document: to run any query you must first open an IndexReader on the index directory ("DirectoryName") and apply the query to it. You can therefore skip the query entirely and use the IndexReader methods numDocs() to get a count of all the documents, and document(int n) to retrieve any one of them.

  • 2021-02-20 17:23

    I'm assuming with id:* you're just trying to match all documents, right?

    I've never used Solr before, but in my Lucene experience we added a hidden field to every document at ingest time, holding the same string constant for every record. When we need to return every record, we search for that constant in that field.

    If you can't add a field like that in your situation, you could use a RegexQuery with a regex that would match anything that could be found in the id field.

    Edit: actually answering the question. I've never heard of a patch to get that to work, and I would be surprised if it could even be made to work reasonably well. See this question for a reason why unconstrained PrefixQueries can cause a problem.

  • 2021-02-20 17:30

    The following Solr issue is a request to be able to configure the default Lucene query parser: https://issues.apache.org/jira/browse/SOLR-218

    In this issue you can find the following description of how to 'patch' Solr. This modification would allow you to start queries with a *.

    Jonas Salk: I've basically updated only one Java file: SolrQueryParser.java.

    public SolrQueryParser(IndexSchema schema, String defaultField) { 
        ... 
        setAllowLeadingWildcard(true); 
        setLowercaseExpandedTerms(true); 
        ... 
    }
    
     ...
    
    public SolrQueryParser(QParser parser, String defaultField, Analyzer analyzer) {
        ... 
        setAllowLeadingWildcard(true); 
        setLowercaseExpandedTerms(true);
        ... 
    }
    

    I'm not sure if setLowercaseExpandedTerms is needed...
