transformer

Error converting Pegasus to the ONNX format from Transformers

◇◆丶佛笑我妖孽 submitted on 2021-02-10 14:21:55
Question: I am trying to convert the Pegasus newsroom model in HuggingFace's transformers to the ONNX format. I followed this guide published by Huggingface. After installing the prerequisites, I ran this code:

    !rm -rf onnx/
    from pathlib import Path
    from transformers.convert_graph_to_onnx import convert

    convert(
        framework="pt",
        model="google/pegasus-newsroom",
        output=Path("onnx/google/pegasus-newsroom.onnx"),
        opset=11,
    )

and got these errors: ValueError Traceback (most recent call last) <ipython-input-9
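A plausible workaround, sketched under assumptions (the traceback above is cut off, so the exact error is unknown): Pegasus is an encoder-decoder model, and the generic convert() helper is geared toward single-stack models, so tracing can fail on the paired encoder/decoder inputs. One common approach is to export the encoder on its own with torch.onnx.export, wrapping it so the traced module returns a plain tensor:

    import torch
    from transformers import PegasusTokenizer, PegasusForConditionalGeneration

    class EncoderWrapper(torch.nn.Module):
        """Return a plain tensor instead of a model-output object."""
        def __init__(self, encoder):
            super().__init__()
            self.encoder = encoder

        def forward(self, input_ids, attention_mask):
            return self.encoder(input_ids=input_ids,
                                attention_mask=attention_mask)[0]

    tokenizer = PegasusTokenizer.from_pretrained("google/pegasus-newsroom")
    model = PegasusForConditionalGeneration.from_pretrained("google/pegasus-newsroom")
    model.eval()

    inputs = tokenizer("An example article.", return_tensors="pt")

    torch.onnx.export(
        EncoderWrapper(model.model.encoder),
        (inputs["input_ids"], inputs["attention_mask"]),
        "pegasus-encoder.onnx",
        input_names=["input_ids", "attention_mask"],
        output_names=["last_hidden_state"],
        dynamic_axes={
            "input_ids": {0: "batch", 1: "sequence"},
            "attention_mask": {0: "batch", 1: "sequence"},
        },
        opset_version=11,
    )

The decoder would need a second export of its own; generation then has to be driven from the two ONNX sessions by hand.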

Typechecking after running Typescript compiler plugin/transformer

泪湿孤枕 submitted on 2021-02-10 13:16:29
Question: I'm following a blog (https://dev.doctorevidence.com/how-to-write-a-typescript-transform-plugin-fc5308fdd943) on how to write a TypeScript compiler plugin/transformer. After applying a first simple transformation that should introduce a type error (a property accessed on an object that doesn't have that property), I noticed that no type error is shown. In fact, the compiler proceeds as normal.

    import * as ts from "typescript";

    export const transformerFactory = (
      program: ts.Program
    ):
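This behavior has a likely explanation: custom transformers run during emit, after the checker has already analyzed the original sources, so errors introduced by a transform are never reported. A hedged sketch of one way to surface them (the helper name and host override are my own illustration, not from the blog): print the transformed file and type-check the printed text with a fresh program.

    import * as ts from "typescript";

    // Hypothetical helper: re-check a transformed source file by printing it
    // and building a fresh Program whose host serves the printed text.
    function recheckTransformed(fileName: string, transformed: ts.SourceFile): void {
      const printer = ts.createPrinter();
      const newText = printer.printFile(transformed);

      const options: ts.CompilerOptions = { strict: true };
      const host = ts.createCompilerHost(options);
      const originalGetSourceFile = host.getSourceFile;
      host.getSourceFile = (name, languageVersion, onError, shouldCreate) =>
        name === fileName
          ? ts.createSourceFile(name, newText, languageVersion, true)
          : originalGetSourceFile.call(host, name, languageVersion, onError, shouldCreate);

      const program = ts.createProgram([fileName], options, host);
      for (const diag of ts.getPreEmitDiagnostics(program)) {
        console.log(ts.flattenDiagnosticMessageText(diag.messageText, "\n"));
      }
    }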

Get probability of multi-token word in MASK position

橙三吉。 submitted on 2020-12-05 11:57:31
Question: It is relatively easy to get a token's probability according to a language model, as the snippet below shows. You can take the model's output, restrict yourself to the output at the masked position, and then look up the probability of your requested token in that output vector. However, this only works for single-token words, i.e. words that are themselves in the tokenizer's vocabulary. When a word does not exist in the vocabulary, the tokenizer will chunk it up into pieces that it does know (see
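A minimal sketch of the single-token case the question describes (the model and example sentence are my own illustration, assuming a BERT masked LM):

    import torch
    from transformers import BertTokenizer, BertForMaskedLM

    tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
    model = BertForMaskedLM.from_pretrained("bert-base-uncased")
    model.eval()

    inputs = tokenizer("The capital of France is [MASK].", return_tensors="pt")
    # Position of the [MASK] token in the input sequence.
    mask_pos = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero().item()

    with torch.no_grad():
        logits = model(**inputs)[0]        # shape: (batch, seq_len, vocab_size)

    probs = logits[0, mask_pos].softmax(dim=-1)
    print(probs[tokenizer.convert_tokens_to_ids("paris")].item())

For a multi-token word, one common approximation is to insert one [MASK] per sub-token and multiply the probabilities of the pieces, though that treats the pieces as independent.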

How to reconstruct text entities with Hugging Face's transformers pipelines without IOB tags?

旧时模样 submitted on 2020-05-15 05:13:10
Question: I've been looking to use Hugging Face's pipelines for NER (named entity recognition). However, it returns the entity labels in inside-outside-beginning (IOB) format but without the IOB labels, so I'm not able to map the pipeline's output back to my original text. Moreover, the outputs are in BERT sub-word tokenization format (the default model is BERT-large). For example:

    from transformers import pipeline

    nlp_bert_lg = pipeline('ner')
    print(nlp_bert_lg('Hugging Face is a French
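A hedged sketch of one way to stitch the WordPiece output back together (the example sentence is my own, since the question's input is cut off above): merge any token that starts with "##" into the entity before it. Recent transformers releases can also do this grouping for you via pipeline('ner', grouped_entities=True), later superseded by the aggregation_strategy argument.

    from transformers import pipeline

    nlp = pipeline('ner')
    raw = nlp('Hugging Face is a French company.')

    merged = []
    for ent in raw:
        if ent['word'].startswith('##') and merged:
            # Glue the WordPiece continuation onto the previous token.
            merged[-1]['word'] += ent['word'][2:]
        else:
            merged.append(dict(ent))
    print(merged)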

BERT token importance measurement issue: grad is None

我只是一个虾纸丫 submitted on 2020-04-30 06:36:26
Question: I am trying to measure token importance for BERT by comparing the gradients on the token embeddings. To get the gradients, I copied the forward of BertModel from huggingface transformers 2.8.0 (https://github.com/huggingface/transformers/blob/11c3257a18c4b5e1a3c1746eefd96f180358397b/src/transformers/modeling_bert.py) and changed it a bit. Code:

    embedding_output = self.embeddings(
        input_ids=input_ids,
        position_ids=position_ids,
        token_type_ids=token_type_ids,
        inputs_embeds=inputs_embeds,
    )
    embedding
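The usual cause of a None gradient here is that PyTorch only populates .grad on leaf tensors, and embedding_output is an intermediate result. A hedged sketch of a fix that avoids copying the model's forward at all (the sentence and the sum() loss are placeholders of my own):

    import torch
    from transformers import BertTokenizer, BertModel

    tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
    model = BertModel.from_pretrained("bert-base-uncased")
    model.eval()

    inputs = tokenizer("a short example", return_tensors="pt")

    # Build the word embeddings ourselves so we hold a handle to them.
    embed = model.get_input_embeddings()(inputs["input_ids"])
    embed.retain_grad()   # without this, .grad on a non-leaf tensor stays None

    outputs = model(inputs_embeds=embed, attention_mask=inputs["attention_mask"])
    outputs[0].sum().backward()   # any scalar works for a demonstration

    # Per-token importance as the gradient norm over the hidden dimension.
    print(embed.grad.norm(dim=-1))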

Error Transforming Document to String

谁都会走 submitted on 2020-02-07 03:38:49
Question: See this link: HTML DOM Tree to String - Transformer NullPointerException. I ran into the same problem as that asker, but he couldn't solve it. I don't want to change from JBrowser to DJ Project. I'm curious about this problem. Any idea what's wrong? Thanks!

Edit: HTML file: http://www.uploadmb.com/dw.php?id=1372739472

This is the method that transforms a document to a string:

    public String getStringFromDocument(org.w3c.dom.Document doc) {
        StringWriter sw = new StringWriter();
        try {
            doc =
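For reference, a standard JAXP identity transform with the null checks that usually pinpoint this NullPointerException (a generic sketch, not the asker's exact code, which is cut off above):

    import java.io.StringWriter;
    import javax.xml.transform.OutputKeys;
    import javax.xml.transform.Transformer;
    import javax.xml.transform.TransformerFactory;
    import javax.xml.transform.dom.DOMSource;
    import javax.xml.transform.stream.StreamResult;

    public final class DocumentToString {

        public static String getStringFromDocument(org.w3c.dom.Document doc) {
            // A null Document, or one with no root element, is the usual
            // trigger for the NullPointerException in this transform.
            if (doc == null || doc.getDocumentElement() == null) {
                throw new IllegalArgumentException("document is null or has no root element");
            }
            try {
                StringWriter sw = new StringWriter();
                Transformer t = TransformerFactory.newInstance().newTransformer();
                t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
                t.setOutputProperty(OutputKeys.METHOD, "html");
                t.transform(new DOMSource(doc), new StreamResult(sw));
                return sw.toString();
            } catch (javax.xml.transform.TransformerException e) {
                throw new RuntimeException("Error converting document to string", e);
            }
        }
    }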