pipeline

Are the k-fold cross-validation scores from scikit-learn's `cross_val_score` and `GridSearchCV` biased if we include transformers in the pipeline?

无人久伴 submitted on 2019-12-01 18:44:52
Question: Data preprocessors such as StandardScaler should fit_transform the training set and only transform (not fit) the test set. I expect the same fit/transform process to apply to cross-validation when tuning the model. However, I found that cross_val_score and GridSearchCV fit_transform the entire training set with the preprocessor (rather than fit_transform the inner_train set and transform the inner_validation set). I believe this artificially removes the variance from the inner_validation set
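
The fold-wise procedure the question expects can be sketched without any library, as below. This is a minimal illustration, not scikit-learn's implementation: the helper names (`fit_scaler`, `transform`, `kfold_indices`, `scale_per_fold`) and the toy data are assumptions made up for this example. For what it is worth, when the scaler is a step of a `Pipeline` that is passed to `cross_val_score` or `GridSearchCV`, scikit-learn refits the whole pipeline on each training split, which matches this pattern.

```python
from statistics import mean, pstdev


def fit_scaler(values):
    """Learn (mean, std) from the given values only (hypothetical helper)."""
    mu = mean(values)
    sigma = pstdev(values) or 1.0  # fall back to 1.0 if the spread is zero
    return mu, sigma


def transform(values, params):
    """Standardize values with previously learned parameters."""
    mu, sigma = params
    return [(v - mu) / sigma for v in values]


def kfold_indices(n, k):
    """Yield (train_idx, valid_idx) for k contiguous, unshuffled folds."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        valid = list(range(start, start + size))
        train = [i for i in range(n) if i < start or i >= start + size]
        yield train, valid
        start += size


def scale_per_fold(data, k):
    """Fit the scaler on each inner-train split, then apply it to both splits."""
    results = []
    for train_idx, valid_idx in kfold_indices(len(data), k):
        inner_train = [data[i] for i in train_idx]
        inner_valid = [data[i] for i in valid_idx]
        params = fit_scaler(inner_train)  # fit on inner_train only
        results.append(
            (params, transform(inner_train, params), transform(inner_valid, params))
        )
    return results


data = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]  # toy feature column (assumption)
folds = scale_per_fold(data, k=3)
for params, scaled_train, scaled_valid in folds:
    print(params, scaled_valid)
```

Because the parameters come from the inner-train split alone, the scaled validation values are generally not centered at zero, which is the variance the question is worried about losing.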

Efficient XSLT pipeline, with params, in Java

你。 submitted on 2019-12-01 18:31:58
Question: The top answer to this question describes a technique to implement an efficient XSLT pipeline in Java: Efficient XSLT pipeline in Java (or redirecting Results to Sources) Unfortunately, while Transformer seems to expose an API for setting XSLT parameters, this does not seem to have any effect. For example, I have the following code: Transformer.java import javax.xml.transform.sax.SAXTransformerFactory; import javax.xml.transform.Templates; import javax.xml.transform.sax.TransformerHandler;

Owin Stage Markers

强颜欢笑 submitted on 2019-12-01 18:14:47
Given this in my app startup ... app.Use((context, next) => { return next.Invoke(); }).UseStageMarker(PipelineStage.PostAuthenticate); app.Use((context, next) => { return next.Invoke(); }).UseStageMarker(PipelineStage.Authenticate); ... why does the PostAuthenticate code execute before the Authenticate code? I don't mean "why does the first app.Use get called before the second app.Use". I mean: why does the first invoke get called before the second, given that the second should be happening earlier in the request pipeline? EDIT Related to this problem: How am I getting a Windows identity in

Fitting in nested cross-validation with cross_val_score with pipeline and GridSearch

房东的猫 submitted on 2019-12-01 12:38:17
I am working in scikit-learn and I am trying to tune my XGBoost model. I attempted a nested cross-validation, using the pipeline to rescale the training folds (to avoid data leakage and overfitting), together with GridSearchCV for parameter tuning and cross_val_score to get the roc_auc score at the end. from imblearn.pipeline import Pipeline from sklearn.model_selection import RepeatedKFold from sklearn.model_selection import GridSearchCV from sklearn.model_selection import cross_val_score from xgboost import XGBClassifier std_scaling = StandardScaler() algo = XGBClassifier() steps

GStreamer force decodebin2 output type

落爺英雄遲暮 submitted on 2019-12-01 09:23:55
I'm trying to write a program in C which replicates the pipeline: gst-launch -v filesrc location="bbb.mp4" ! decodebin2 ! ffmpegcolorspace ! autovideosink DecodeBin2 has a dynamic pad and I've attached a callback to handle its creation. However, I am unable to link it to ffmpegcolorspace because the pad capability is always video/quicktime. I would like it to be video/x-raw-yuv or something else which is compatible with ffmpegcolorspace. Is it possible to force/select the output type of decodebin2? Thanks. EDIT: Please do not recommend playbin. I'm trying to learn how to make pipelines.

Sklearn Pipeline: How to build for kmeans, clustering text?

拈花ヽ惹草 submitted on 2019-12-01 07:54:42
I have text as shown: list1 = ["My name is xyz", "My name is pqr", "I work in abc"] The above will be the training set for clustering text using kmeans. list2 = ["My name is xyz", "I work in abc"] The above is my test set. I have built a vectorizer and the model as shown below: vectorizer = TfidfVectorizer(min_df = 0, max_df=0.5, stop_words = "english", charset_error = "ignore", ngram_range = (1,3)) vectorized = vectorizer.fit_transform(list1) km=KMeans(n_clusters=2, init='k-means++', n_init=10, max_iter=1000, tol=0.0001, precompute_distances=True, verbose=0, random_state=None, copy_x=True, n

Type checked decomposable graph in Haskell

寵の児 submitted on 2019-12-01 06:12:18
Question: Let's say I'm creating a data pipeline which will process text files. I have the following types and functions: data A = A deriving (Show, Typeable) data B = B deriving (Show, Typeable) data C = C deriving (Show, Typeable) data D = D deriving (Show, Typeable) step1 :: A -> B step2 :: B -> C step3 :: C -> D For each of the functions step{1..3} below I would like to be able to produce a new file from an existing file, doing something like: interact (lines . map (show . step . read) . unlines

GStreamer force decodebin2 output type

谁都会走 submitted on 2019-12-01 05:42:16
Question: I'm trying to write a program in C which replicates the pipeline: gst-launch -v filesrc location="bbb.mp4" ! decodebin2 ! ffmpegcolorspace ! autovideosink DecodeBin2 has a dynamic pad and I've attached a callback to handle its creation. However, I am unable to link it to ffmpegcolorspace because the pad capability is always video/quicktime. I would like it to be video/x-raw-yuv or something else which is compatible with ffmpegcolorspace. Is it possible to force/select the output type of

Jenkins input pipeline step filled via POST with CSRF - howto?

Deadly submitted on 2019-12-01 04:45:21
I have a Jenkins pipeline with an Input step, and I would like to submit this input (a single string argument) via a script. So far I am trying with curl; ideally I'll be sending it via the Python requests library. This should be an easy POST request, but with CSRF it becomes tricky. I've obtained the Jenkins-Crumb (using curl in this case, from the same machine and same bash session), but still can't send the content... I'm sending the Jenkins-Crumb:XXX header, just as explained at https://wiki.jenkins-ci.org/display/JENKINS/Remote+access+API My request looks like this: curl -vvv -X POST --user '$

Piping another parameter into the line in F#

我怕爱的太早我们不能终老 submitted on 2019-12-01 04:42:44
Does piping a parameter into the line work only for functions that accept one parameter? If we look at the example at Chris Smith's page, // Using the Pipe-Forward operator (|>) let photosInMB_pipeforward = @"C:\Users\chrsmith\Pictures\" |> filesUnderFolder |> Seq.map fileInfo |> Seq.map fileSize |> Seq.fold (+) 0L |> bytesToMB where his filesUnderFolder function was expecting only the rootFolder parameter, what if the function was expecting two parameters, i.e. let filesUnderFolder size rootFolder Then this does not work: // Using the Pipe-Forward operator (|>) let size = 4 let photosInMB
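
In F#, functions are curried, so the piped value becomes the last argument of a partially applied function (for example `rootFolder |> filesUnderFolder size`). The same idea can be sketched in Python, where partial application has to be spelled out with `functools.partial`. This is only an analogy under stated assumptions: `pipe`, `files_over_size`, and `total` are hypothetical names invented here, and a plain list of sizes stands in for a folder listing.

```python
from functools import partial, reduce


def pipe(value, *funcs):
    """Feed value through funcs left to right, like a chain of |> steps."""
    return reduce(lambda acc, f: f(acc), funcs, value)


def files_over_size(min_size, sizes):
    """Two-parameter step: keep only the sizes that are at least min_size."""
    return [s for s in sizes if s >= min_size]


def total(sizes):
    """One-parameter step: sum the remaining sizes."""
    return sum(sizes)


size = 4
# Fix the first argument up front; the piped value fills the last one.
result = pipe([1, 5, 10], partial(files_over_size, size), total)
print(result)
```

The design point is the argument order: the parameter that is fixed ahead of time comes first and the piped data comes last, which is why `let filesUnderFolder size rootFolder` can sit in an F# pipeline once `size` has been supplied.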