pipeline

How do I use output from awk in another command?

倾然丶 夕夏残阳落幕 · Submitted on 2019-12-08 15:58:32

Question: I need to convert a date to a different format. With a bash pipeline, I'm taking the date from the last console login and pulling the relevant bits out with awk, like so:

last $USER | grep console | head -1 | awk '{print $4, $5}'

which outputs: Aug 08 ($4=Aug, $5=08 in this case). Now I want to take 'Aug 08' and feed it to a date command to change the format to a numerical date, which would look something like this:

date -j -f %b\ %d Aug\ 08 +%m-%d

Outputs: 08-08. The question I have is,
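Those two fields can be handed straight to date. A minimal sketch of both variants; note the -j/-f flags from the question are BSD/macOS-specific, while GNU date parses free-form dates with -d:

```shell
# Month/day from the last console login, as in the question:
#   last "$USER" | grep console | head -1 | awk '{print $4, $5}'
# BSD/macOS date (the question's syntax):
#   date -j -f "%b %d" "Aug 08" +%m-%d
# GNU date: -d accepts a free-form date string instead:
date -d "Aug 08" +%m-%d    # prints 08-08
```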

Use a metric after a classifier in a Pipeline

不打扰是莪最后的温柔 · Submitted on 2019-12-08 15:28:43

Question: I am continuing to investigate pipelines. My aim is to execute every machine-learning step through the pipeline alone; that will be more flexible and easier to adapt to another use case. So this is what I do:

Step 1: Fill NaN values
Step 2: Transform categorical values into numbers
Step 3: Classifier
Step 4: GridSearch
Step 5: Add a metric (failed)

Here is my code:

import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.feature_selection import SelectKBest
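A metric is not a Pipeline step, which is presumably why Step 5 failed; the usual home for it is the scoring argument of GridSearchCV. A minimal sketch under that assumption, with SimpleImputer and RandomForestClassifier standing in for the question's own steps:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.metrics import f1_score, make_scorer
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

X, y = make_classification(n_samples=100, random_state=0)

pipe = Pipeline([
    ('impute', SimpleImputer()),                      # Step 1: fill NaN values
    ('clf', RandomForestClassifier(random_state=0)),  # Step 3: classifier
])

# Steps 4 and 5: the metric rides along as GridSearchCV's scorer,
# not as a pipeline step.
grid = GridSearchCV(pipe,
                    {'clf__n_estimators': [10, 20]},
                    scoring=make_scorer(f1_score),
                    cv=3)
grid.fit(X, y)
print(grid.best_params_, grid.best_score_)
```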

Using Stream API for organising application pipeline

谁都会走 · Submitted on 2019-12-08 12:45:28

Question: As far as I know, the Stream API is intended to be applied to collections. But I like the idea so much that I try to apply streams whenever I can, and sometimes when I shouldn't. Originally my app had two threads communicating through a BlockingQueue: the first would produce new elements; the second would transform them and save them to disk. At the time it looked like a perfect opportunity for a stream. The code I ended up with:

Stream.generate().flatten().filter().forEach()

I'd like to put a few maps in there, but it turns
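For what it's worth, java.util.stream has no flatten(); flatMap is the flattening operation, and an infinite producer can be bounded with limit(). A minimal sketch of such a producer/transform/consume pipeline (the stages are illustrative, not the asker's real code):

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class StreamPipeline {
    // Infinite producer bounded with limit(); filter and map stages chain freely.
    static List<Integer> pipeline() {
        return Stream.iterate(1, n -> n + 1)   // producer (cf. Stream.generate())
                     .limit(10)                // make the infinite stream finite
                     .filter(n -> n % 2 == 0)  // keep even numbers
                     .map(n -> n * n)          // transformation stage
                     .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        pipeline().forEach(System.out::println);  // consumer (cf. save to disk)
    }
}
```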

How can I know which files were updated during the GitLab CI pipeline

梦想与她 · Submitted on 2019-12-08 10:36:17

Question: During the GitLab pipeline (triggered after each commit on my branch), I want to know which files are affected by the commit, in order to apply a specific bash script to each file. I'm currently using the following in my gitlabci.yaml file:

- export DIFF=$(git show --stat HEAD)
- ./myBashScript.sh

Then I use $DIFF in my bash script. But is there a better approach? (I'm using a local GitLab 10.8.)

Answer 1: You can use already-existing CI variables to do something like this to retrieve
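One sketch along the lines of Answer 1, using GitLab's predefined CI_COMMIT_BEFORE_SHA and CI_COMMIT_SHA variables (changed_files is an illustrative name; the HEAD~1/HEAD fallbacks only matter when running outside CI):

```shell
# List the files changed by the commit that triggered the pipeline.
# CI_COMMIT_BEFORE_SHA and CI_COMMIT_SHA are predefined GitLab CI variables.
changed_files() {
  git diff --name-only "${CI_COMMIT_BEFORE_SHA:-HEAD~1}" "${CI_COMMIT_SHA:-HEAD}"
}

# Dispatch per file (myBashScript.sh as in the question):
#   changed_files | while read -r f; do ./myBashScript.sh "$f"; done
```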

How to pickle individual steps in sklearn's Pipeline?

﹥>﹥吖頭↗ · Submitted on 2019-12-08 08:21:49

Question: I am using Pipeline from sklearn to classify text. In this example Pipeline, I have a TfidfVectorizer and some custom features wrapped with FeatureUnion, plus a classifier, as the Pipeline steps. I then fit the training data and do the prediction:

from sklearn.pipeline import FeatureUnion, Pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC

X = ['I am a sentence', 'an example']
Y = [1, 2]
X_dev = ['another sentence']

# classifier
LinearSVC1 =
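A fitted Pipeline exposes its steps through named_steps, and each fitted step can be pickled on its own. A minimal sketch without the custom FeatureUnion features (note that custom transformers must live in an importable module, not be lambdas or locals, to pickle cleanly):

```python
import pickle

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC

X = ['I am a sentence', 'an example']
Y = [1, 2]

pipe = Pipeline([('tfidf', TfidfVectorizer()), ('clf', LinearSVC())])
pipe.fit(X, Y)

# named_steps returns the fitted step objects; each pickles independently.
vec = pipe.named_steps['tfidf']
restored = pickle.loads(pickle.dumps(vec))
print(restored.transform(['another sentence']).shape)
```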

Building/waiting for the latest version of a parent job

南笙酒味 · Submitted on 2019-12-08 08:03:42

Question: We have Maven projects on git with the structure

pro-A
-- pro-B
-- pro-C
pro-D
-- pro-E

These are all projects with their own repo in git and their own build pipeline in Jenkins, with stages as follows:

build -- deploy to TEST -- run tests -- (manual trigger) deploy to QA

Every build gets deployed to the Maven repo with the Jenkins build number appended to it, merged to the release branch from master, and tagged with the new version number, e.g. 1.0.9-649. So, pro-A is the parent of all projects, pro-B only

Invalid parameter clf for estimator Pipeline in sklearn

女生的网名这么多〃 · Submitted on 2019-12-08 08:01:29

Question: Could anyone check for problems in the following code? Am I going wrong at any step in building my model? I already added the 'clf__' prefix to the parameters.

clf = RandomForestClassifier()
pca = PCA()
pca_clf = make_pipeline(pca, clf)
kfold = KFold(n_splits=10, random_state=22)
parameters = {'clf__n_estimators': [4, 6, 9],
              'clf__max_features': ['log2', 'sqrt', 'auto'],
              'clf__criterion': ['entropy', 'gini'],
              'clf__max_depth': [2, 3, 5, 10],
              'clf__min_samples_split': [2, 3, 5],
              'clf__min_samples_leaf': [1, 5, 8]}
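One likely culprit, sketched below: make_pipeline names steps after the lowercased class name ('pca', 'randomforestclassifier'), so a 'clf__' prefix matches no step. Building the Pipeline with an explicit ('clf', ...) name makes the prefix valid (the data and the reduced grid here are illustrative):

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, KFold
from sklearn.pipeline import Pipeline

X, y = make_classification(n_samples=60, random_state=22)

clf = RandomForestClassifier(random_state=22)
pca = PCA()
# Explicit step names, so 'clf__...' is a valid parameter prefix.
# (make_pipeline would have named the step 'randomforestclassifier'.)
pca_clf = Pipeline([('pca', pca), ('clf', clf)])

kfold = KFold(n_splits=5)
parameters = {'clf__n_estimators': [4, 6]}   # reduced grid for brevity
grid = GridSearchCV(pca_clf, parameters, cv=kfold)
grid.fit(X, y)
print(grid.best_params_)
```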

Why does sklearn Pipeline call transform() so many more times than fit()?

大兔子大兔子 · Submitted on 2019-12-08 07:39:37

Question: After a lot of reading and inspecting the pipeline.fit() operation under different verbose settings, I'm still confused about why my pipeline visits a certain step's transform method so many times. Below is a trivial example pipeline, fit with GridSearchCV using 3-fold cross-validation but a param grid with only one set of hyperparameters, so I expected three runs through the pipeline. Both step1 and step2 have fit called three times, as expected, but each step has transform called
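The extra calls come from cross-validation itself: in each fold the fitted transformer runs on the training split (inside fit) and again on the validation split (inside scoring), and the final refit adds one more of each. A sketch with a call-counting transformer (illustrative, not the asker's pipeline):

```python
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

class CountingTransformer(BaseEstimator, TransformerMixin):
    fit_calls = 0        # class-level so clones made by GridSearchCV share them
    transform_calls = 0

    def fit(self, X, y=None):
        CountingTransformer.fit_calls += 1
        return self

    def transform(self, X):
        CountingTransformer.transform_calls += 1
        return X

X, y = make_classification(n_samples=30, random_state=0)
pipe = Pipeline([('step1', CountingTransformer()),
                 ('clf', LogisticRegression())])

# 3 folds, a single hyperparameter candidate, plus the final refit.
grid = GridSearchCV(pipe, {'clf__C': [1.0]}, cv=3)
grid.fit(X, y)

# fit runs once per fold plus once for the refit; transform additionally
# runs on each fold's validation split during scoring, so it runs more often.
print(CountingTransformer.fit_calls, CountingTransformer.transform_calls)
```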

How to programmatically add a PSCmdlet to a Powershell pipeline?

泪湿孤枕 · Submitted on 2019-12-08 06:45:02

Question: I'd like to programmatically assemble and run a pipeline containing my own PSCmdlet. However, the Pipeline class only allows adding strings and Commands (which are constructed from strings in turn).

var runspace = ...;
var pipeline = runspace.CreatePipeline();
pipeline.AddCommand("Get-Date"); // ok
var myCmdlet = new MyCmdlet();
pipeline.AddCommand(myCmdlet); // Doesn't compile - am I fundamentally
// misunderstanding some difference between commands and commandlets?
foreach (var res in

How do I split a string on a delimiter ` in Bash?

喜夏-厌秋 · Submitted on 2019-12-08 05:52:04

Question: How can I get only modules/bx/motif from the following, through a pipeline?

$ find . | grep denied
find: `modules/bx/motif': Permission denied

Answer 1: Simply this, using sed:

find . 2>&1 | sed 's/^[^:]*: .\(.*\).: Permission denied/\1/p;d'

or, using bash only, as your question is tagged bash:

string=$'find: `modules/bx/motif\047: Permission denied'
echo $string
find: `modules/bx/motif': Permission denied
part=${string#*\`}
echo ${part%\'*}
modules/bx/motif

Answer 2: You can redirect STDOUT (where the
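For the record, the "Permission denied" lines arrive on find's stderr, which is why Answer 1 starts with 2>&1; the parameter-expansion trick from Answer 1 then isolates the path:

```shell
# Pure-bash extraction (same idea as Answer 1's second snippet):
line="find: \`modules/bx/motif': Permission denied"
part=${line#*\`}    # strip up to and including the backquote
path=${part%\'*}    # strip the closing quote and the trailing message
echo "$path"        # prints modules/bx/motif
```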