pipeline

Sklearn Pipeline: How to build for kmeans, clustering text?

末鹿安然 submitted on 2019-12-01 04:19:57
Question: I have the following text:
    list1 = ["My name is xyz", "My name is pqr", "I work in abc"]
This is the training set for clustering text with k-means.
    list2 = ["My name is xyz", "I work in abc"]
This is my test set. I built a vectorizer and a model as shown below:
    vectorizer = TfidfVectorizer(min_df = 0, max_df=0.5, stop_words = "english", charset_error = "ignore", ngram_range = (1,3))
    vectorized = vectorizer.fit_transform(list1)
    km=KMeans(n_clusters=2, init='k-means++', n_init=10,

XSLT with XProc - parameter binding in the required type

会有一股神秘感。 submitted on 2019-12-01 00:30:53
I'm trying to translate my batch file, which calls Saxon (version 8.9), into an XProc pipeline (Calabash). This is my batch call:
    java -jar saxon8.jar -o out.xml in.xml style.xsl +config=config-file.cfg
The parameter config is defined in the stylesheet in this way:
    <xsl:param name="config" as="document-node()"/>
The XProc part looks like this:
    <p:load name="configLoad">
      <p:with-option name="href" select="'config-file.cfg'"/>
    </p:load>
    <p:xslt name="config">
      <p:input port="source">
        <p:document href="in.xml"/>
      </p:input>
      <p:input port="parameters">
        <p:inline>
          <c:param name="config">
            <p:pipe port=

Fitting in nested cross-validation with cross_val_score with pipeline and GridSearch

你说的曾经没有我的故事 submitted on 2019-11-30 20:06:13
Question: I am working in scikit-learn and trying to tune my XGBoost model. I attempted nested cross-validation, using a pipeline to rescale the training folds (to avoid data leakage and overfitting), together with GridSearchCV for parameter tuning and cross_val_score to get the roc_auc score at the end.
    from imblearn.pipeline import Pipeline
    from sklearn.model_selection import RepeatedKFold
    from sklearn.model_selection import GridSearchCV
    from sklearn.model_selection import cross

XSLT with XProc - parameter binding in the required type

一笑奈何 submitted on 2019-11-30 19:27:21
Question: I'm trying to translate my batch file, which calls Saxon (version 8.9), into an XProc pipeline (Calabash). This is my batch call:
    java -jar saxon8.jar -o out.xml in.xml style.xsl +config=config-file.cfg
The parameter config is defined in the stylesheet in this way:
    <xsl:param name="config" as="document-node()"/>
The XProc part looks like this:
    <p:load name="configLoad">
      <p:with-option name="href" select="'config-file.cfg'"/>
    </p:load>
    <p:xslt name="config">
      <p:input port="source">
        <p:document

Is it possible to pipe conditionally in Powershell, i.e. execute an element of a pipeline only if a condition is met?

你说的曾经没有我的故事 submitted on 2019-11-30 11:38:41
I want to do something like this:
    <statement> | <filter1> | <filter2> if <condition> | <filter3> | <filter4> | <filter5>
The results of <statement> run through <filter1>, then through <filter2> only if <condition> is met, then through the remaining filters regardless of whether <filter2> was applied. This is equivalent to:
    if (<condition>) {
        <statement> | <filter1> | <filter2> | <filter3> | <filter4> | <filter5>
    } else {
        <statement> | <filter1> | <filter3> | <filter4> | <filter5>
    }
This would be useful in functions where a given filter is applied to the result set only if a
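The conditional-stage idea in this question can be sketched outside PowerShell. The following is a minimal Python illustration, not a PowerShell answer: the helper names `maybe` and `run_pipeline` and the three example filters are hypothetical, introduced only for this sketch. The pipeline shape stays fixed, and a stage whose condition is false is swapped for a pass-through.

```python
from typing import Callable, Iterable, Iterator

# A stage takes an iterable of items and yields transformed items.
Stage = Callable[[Iterable[int]], Iterator[int]]


def maybe(condition: bool, stage: Stage) -> Stage:
    """Return `stage` if `condition` is true, else a stage that changes nothing."""
    def passthrough(items: Iterable[int]) -> Iterator[int]:
        yield from items
    return stage if condition else passthrough


def run_pipeline(source: Iterable[int], *stages: Stage) -> list:
    """Feed `source` through each stage in order and collect the result."""
    result = source
    for stage in stages:
        result = stage(result)
    return list(result)


# Hypothetical filters standing in for <filter1>, <filter2>, <filter3>.
def keep_even(items):
    return (x for x in items if x % 2 == 0)


def double(items):
    return (x * 2 for x in items)


def add_one(items):
    return (x + 1 for x in items)


applied = run_pipeline(range(6), keep_even, maybe(True, double), add_one)
skipped = run_pipeline(range(6), keep_even, maybe(False, double), add_one)
print(applied)  # [1, 5, 9]
print(skipped)  # [1, 3, 5]
```

Wrapping the optional stage keeps a single pipeline definition instead of duplicating the whole chain in an if/else.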

Share gitlab-ci.yml between projects

孤者浪人 submitted on 2019-11-30 07:35:33
We are thinking of moving our CI from Jenkins to GitLab. We have several projects that share the same build workflow. Right now we use a shared library in which the pipelines are defined, and the Jenkinsfile inside each project only calls a method from the shared library that defines the actual pipeline. Changes therefore have to be made in a single place and affect several projects. I am wondering whether the same is possible with GitLab CI. As far as I have found out, it is not possible to define the gitlab-ci.yml outside the repository. Is there another way to define a pipeline and share this config with

Service Fabric Reliable Services Pipeline design

社会主义新天地 submitted on 2019-11-29 23:57:40
I need to implement a pipeline in Service Fabric's Reliable Services, and I need some guidelines on which of these approaches is preferable in terms of reliability, simplicity, and good design:
I have been investigating this topic a lot as well (to be applied to my work for NServiceBus and MessageHandler) and would like to provide my thoughts on the matter. However, I haven't yet determined what the best model is. If you disregard the practical implementation with ServiceFabric, I would rank the proposed approaches in the following order when it comes to reliability: C) The

Is it possible to pipe conditionally in Powershell, i.e. execute an element of a pipeline only if a condition is met?

丶灬走出姿态 submitted on 2019-11-29 17:23:18
Question: I want to do something like this:
    <statement> | <filter1> | <filter2> if <condition> | <filter3> | <filter4> | <filter5>
The results of <statement> run through <filter1>, then through <filter2> only if <condition> is met, then through the remaining filters regardless of whether <filter2> was applied. This is equivalent to:
    if (<condition>) {
        <statement> | <filter1> | <filter2> | <filter3> | <filter4> | <filter5>
    } else {
        <statement> | <filter1> | <filter3> | <filter4> | <filter5>

Recursive piping in Unix again

爷,独闯天下 submitted on 2019-11-29 14:55:06
Question: I know this topic has come up several times already, but I'm still stuck at one point. I need to write a program that emulates cmd1 | cmd2 | cmd3 ... piping. My code is here: http://ideone.com/fedrB8
    #include <stdio.h>
    #include <unistd.h>
    #include <sys/types.h>
    #include <stdlib.h>
    void pipeline( char * ar[], int pos, int in_fd);
    void error_exit(const char*);
    static int child = 0; /* whether it is a child process relative to main() */
    int main(int argc, char * argv[]) {
        if(argc < 2){
            printf("Usage
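For comparison with the C attempt, here is a rough sketch of the same cmd1 | cmd2 | cmd3 chaining. It swaps the fork/pipe approach for Python's `subprocess` module and an iterative loop, so it is not the asker's recursive design. The function name `run_pipeline` and the two demo commands are assumptions made for illustration.

```python
import subprocess
import sys


def run_pipeline(commands, input_bytes=b""):
    """Run each argv list in `commands`, wiring stdout of one to stdin of the next.

    Sketch only: the input is written in one go before any output is read,
    so it is suitable for small inputs, not for large streams.
    """
    procs = []
    prev_stdout = None
    for index, argv in enumerate(commands):
        stdin = subprocess.PIPE if index == 0 else prev_stdout
        proc = subprocess.Popen(argv, stdin=stdin, stdout=subprocess.PIPE)
        if prev_stdout is not None:
            # The parent closes its copy of the read end so that only the
            # downstream child holds it, as a shell would arrange.
            prev_stdout.close()
        procs.append(proc)
        prev_stdout = proc.stdout

    procs[0].stdin.write(input_bytes)
    procs[0].stdin.close()          # signal end-of-input to the first command
    output = procs[-1].stdout.read()
    procs[-1].stdout.close()
    for proc in procs:
        proc.wait()                 # reap every child
    return output


# Demo commands use the current Python interpreter so the example does not
# depend on which Unix tools are installed.
upper = [sys.executable, "-c",
         "import sys; sys.stdout.write(sys.stdin.read().upper())"]
reverse = [sys.executable, "-c",
           "import sys; sys.stdout.write(sys.stdin.read()[::-1])"]

print(run_pipeline([upper, reverse], b"abc"))  # b'CBA'
```

The detail that carries over to the C version is closing unused pipe ends in the parent: a reader only sees end-of-file once every write end of its pipe has been closed.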

Why piping to the same file doesn't work on some platforms?

断了今生、忘了曾经 submitted on 2019-11-29 14:09:33
In cygwin, the following code works fine:
    $ cat junk
    bat
    bat
    bat
    $ cat junk | sort -k1,1 |tr 'b' 'z' > junk
    $ cat junk
    zat
    zat
    zat
But in the Linux shell (GNU/Linux), overwriting does not seem to work:
    [41] othershell: cat junk
    cat
    cat
    cat
    [42] othershell: cat junk |sort -k1,1 |tr 'c' 'z'
    zat
    zat
    zat
    [43] othershell: cat junk |sort -k1,1 |tr 'c' 'z' > junk
    [44] othershell: cat junk
Both environments run Bash. I am asking because, after text manipulation, this caveat sometimes forces me to create a temporary file. But I know that in Perl you can give the "i" flag to overwrite the
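The empty result on Linux is consistent with how shells set up redirections: `> junk` truncates the file when the pipeline is started, and nothing guarantees that `cat` has read it by then, so whether the data survives depends on timing. A common workaround is to read everything first, write to a temporary file, and rename it over the original. Below is a minimal Python sketch of that pattern; the names `rewrite_in_place` and `sort_and_translate` are hypothetical helpers, not part of any tool mentioned above.

```python
import os
import tempfile


def rewrite_in_place(path, transform):
    """Replace the contents of `path` with transform(old contents).

    The whole file is read before anything is written, and the new contents
    go to a temporary file in the same directory, which is then renamed over
    the original. The input is therefore never truncated while being read.
    """
    with open(path, "r", encoding="utf-8") as src:
        text = src.read()

    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            tmp.write(transform(text))
        os.replace(tmp_path, path)   # swap the new file into place
    except Exception:
        os.unlink(tmp_path)          # clean up the temporary file on failure
        raise


def sort_and_translate(text):
    """Rough equivalent of: sort | tr 'c' 'z' (hypothetical demo transform)."""
    lines = sorted(text.splitlines())
    return "".join(line.replace("c", "z") + "\n" for line in lines)


if __name__ == "__main__":
    demo_dir = tempfile.mkdtemp()
    demo_path = os.path.join(demo_dir, "junk")
    with open(demo_path, "w", encoding="utf-8") as f:
        f.write("cat\ncat\ncat\n")
    rewrite_in_place(demo_path, sort_and_translate)
    with open(demo_path, encoding="utf-8") as f:
        print(f.read(), end="")
```

This still uses a temporary file, but it is hidden inside one helper rather than managed by hand at the prompt.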