Which deep learning algorithm does spaCy use when we train a custom model?

广开言路 2021-01-21 03:37

When we train a custom model, I see that we have dropout and n_iter parameters to tune, but which deep learning algorithm does spaCy use to train custom models? Also, when adding n

1 answer
  • 2021-01-21 03:58

    Which learning algorithm does spaCy use?

    spaCy has its own deep learning library, Thinc, which it uses under the hood for its NLP models. For most (if not all) tasks, spaCy uses a deep neural network based on CNNs with a few tweaks. Specifically for named entity recognition, spaCy uses:

    1. A transition-based approach borrowed from shift-reduce parsers, described in the paper Neural Architectures for Named Entity Recognition by Lample et al. Matthew Honnibal describes how spaCy uses this in a YouTube video.

    2. A framework called "Embed. Encode. Attend. Predict" (covered in the video and its slides).

      • Embed: Words are embedded using a Bloom filter. Hashes of the words, rather than the words themselves, are kept as keys in the embedding dictionary. This keeps the embedding dictionary compact, at the cost that different words can collide and end up with the same vector representation.

      • Encode: The list of words is encoded into a sentence matrix to take context into account. spaCy uses a CNN for encoding.

      • Attend: Decide which parts are more informative given a query, and get problem-specific representations.

      • Predict: spaCy uses a multi-layer perceptron for inference.
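The Embed step can be sketched in a few lines. This is only a toy illustration of hashed embeddings, not spaCy's or Thinc's actual code: the class name `HashEmbed`, its parameters, the use of MD5, and the choice to sum several hashed rows per word are all assumptions made for the example.

```python
import hashlib
import random


class HashEmbed:
    """Toy hashed embedding table: rows are indexed by word hashes, not by words."""

    def __init__(self, n_rows=1000, width=8, n_hashes=4, seed=0):
        rng = random.Random(seed)
        self.n_rows = n_rows
        self.width = width
        self.n_hashes = n_hashes  # illustrative choice, not taken from the answer
        # Fixed-size table; no word-to-index vocabulary is stored anywhere.
        self.table = [
            [rng.uniform(-0.1, 0.1) for _ in range(width)] for _ in range(n_rows)
        ]

    def _rows(self, word):
        # Derive n_hashes row indices from the word by salting the hash input.
        rows = []
        for salt in range(self.n_hashes):
            digest = hashlib.md5(f"{salt}:{word}".encode("utf-8")).digest()
            rows.append(int.from_bytes(digest[:8], "little") % self.n_rows)
        return rows

    def __call__(self, word):
        # Sum the selected rows. Two different words end up with the same
        # vector when they happen to select the same rows (a collision).
        vec = [0.0] * self.width
        for row in self._rows(word):
            for i, value in enumerate(self.table[row]):
                vec[i] += value
        return vec


emb = HashEmbed(n_rows=50, width=4)
vector = emb("apple")        # same word always maps to the same vector
unseen = emb("zyzzyva")      # unseen words still get a vector, no OOV entry needed
```

The table size stays fixed no matter how many distinct words are seen, which is the compactness the answer refers to; the price is the occasional collision.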

    Advantages of this framework, per Honnibal, are:

    1. Mostly equivalent to sequence tagging (another task spaCy offers models for)
    2. Shares code with the parser
    3. Easily excludes invalid sequences
    4. Arbitrary features are easily defined
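The point about excluding invalid sequences can be illustrated with a small state machine. The sketch below is a simplified version of the SHIFT / OUT / REDUCE scheme from Lample et al., written for illustration only; the function names and the exact validity rules are assumptions, and spaCy's real transition system may use a different action set.

```python
def valid_actions(stack, buffer, labels):
    """Return the actions allowed in the current state."""
    actions = []
    if buffer:
        actions.append("SHIFT")      # move the next word onto the stack (inside an entity)
        if not stack:
            actions.append("OUT")    # next word is outside any entity
    if stack:
        # An open entity can be closed with any label.
        actions.extend(f"REDUCE-{label}" for label in labels)
    return actions


def apply_actions(words, actions, labels):
    """Run a sequence of actions and return (entity_text, label) pairs."""
    stack, buffer, entities = [], list(words), []
    for action in actions:
        if action not in valid_actions(stack, buffer, labels):
            raise ValueError(f"invalid action {action!r} in this state")
        if action == "SHIFT":
            stack.append(buffer.pop(0))
        elif action == "OUT":
            buffer.pop(0)
        else:  # REDUCE-<label>: pop the whole stack as one labelled entity
            entities.append((" ".join(stack), action.split("-", 1)[1]))
            stack.clear()
    return entities


words = ["Mark", "Watney", "visited", "Mars"]
actions = ["SHIFT", "SHIFT", "REDUCE-PER", "OUT", "SHIFT", "REDUCE-LOC"]
entities = apply_actions(words, actions, labels=["PER", "LOC"])
```

Because the model only ever scores the actions returned by `valid_actions`, an ill-formed output (for example, closing an entity that was never opened) cannot be produced, rather than having to be filtered out afterwards.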

    For a full overview, Matthew Honnibal describes how the model works in this YouTube video. The slides can be found here.

    Note: This information is based on slides from 2017. The engine might have changed since then.

    When adding a new entity type, should we create a blank model or train an existing one?

    Theoretically, when fine-tuning a spaCy model with new entities, you have to make sure the model doesn't forget representations for previously learned entities. The best thing, if possible, is to train a model from scratch, but that might not be easy or possible due to lack of data or resources.
