data-science

Finding the closest coordinates to a point

落花浮王杯 submitted on 2021-01-29 11:55:47
Question: I have the coordinates of a city, (52.2319581, 21.0067249), and a Python dictionary of the cities around it. How can I get the 3 cities closest to the given coordinates? ({'Brwinów': (52.133333, 20.716667), 'Warszawa Bielany': (52.283333, 20.966667), 'Legionowo': (52.4, 20.966667), 'Warszawa-Okęcie': (52.16039, 20.961674), 'Warszawa': (52.280957, 20.961348), 'Belsk Duży': (51.833333, 20.8)}, {}) Thanks for the help.

Answer 1: Without any external libraries: from math import acos, cos, sin def gc_distance(first
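The answer above is cut off, so its `gc_distance` function cannot be reproduced here. As a minimal sketch of the same idea using only the standard library, the example below ranks the cities by great-circle distance. It swaps in the haversine formula rather than the `acos`-based formula the answer imports, and the names `haversine_km`, `closest_cities`, and `EARTH_RADIUS_KM` are assumptions made for illustration, not part of the original answer.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius, in kilometres


def haversine_km(first, second):
    """Great-circle distance in km between two (lat, lon) pairs given in degrees."""
    lat1, lon1 = map(radians, first)
    lat2, lon2 = map(radians, second)
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def closest_cities(origin, cities, n=3):
    """Return the names of the n cities nearest to origin, nearest first."""
    return sorted(cities, key=lambda name: haversine_km(origin, cities[name]))[:n]


origin = (52.2319581, 21.0067249)
cities = {
    'Brwinów': (52.133333, 20.716667),
    'Warszawa Bielany': (52.283333, 20.966667),
    'Legionowo': (52.4, 20.966667),
    'Warszawa-Okęcie': (52.16039, 20.961674),
    'Warszawa': (52.280957, 20.961348),
    'Belsk Duży': (51.833333, 20.8),
}

nearest = closest_cities(origin, cities)
print(nearest)
```

Sorting the whole dictionary is fine for a handful of cities. For a large set, `heapq.nsmallest` with the same key function would avoid a full sort.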

Is it a good idea to exclude noisy data from the dataset to train the model?

旧时模样 submitted on 2021-01-29 11:39:02
Question: Is it a good idea to exclude noisy data (which may reduce model accuracy or cause unexpected output on the test dataset) from a dataset before generating the training and validation sets? Assumption: the noisy data is known to us in advance. Any suggestion is deeply appreciated!

Answer 1: It depends on your application. If the noisy data is valid, then definitely include it to find the best model. However, if the noisy data is invalid, then it should be cleaned out before fitting your model. Noise

Modeling noisy 1/x data in R, getting “essentially perfect fit” from summary - why? [closed]

三世轮回 submitted on 2021-01-29 11:14:19
Question: Just trying to walk myself through how fitting a reciprocal function to data would go, using the following toy example: # includes library(ggplot2) library(forecast) library(scales) # make data sampledata <- as.data.frame( .1 * seq(1, 20)) names(sampledata) <- c

Python connect composed keywords in texts

狂风中的少年 submitted on 2021-01-29 10:11:38
Question: I have a list of lowercase keywords, say keywords = ['machine learning', 'data science', 'artificial intelligence'], and a list of lowercase texts, say texts = [ 'the new machine learning model built by google is revolutionary for the current state of artificial intelligence. it may change the way we are thinking', 'data science and artificial intelligence are two different fields, although they are interconnected. scientists from harvard are explaining it in a detailed
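The question is cut off before it states the desired output. Going by the title, one plausible reading is that each multi-word keyword should be joined into a single token wherever it occurs in a text. The sketch below assumes that reading and assumes an underscore as the joiner. The helper `join_keywords` and its `sep` parameter are hypothetical, and only the first text is used because the second one is truncated in the excerpt.

```python
import re


def join_keywords(text, keywords, sep="_"):
    """Replace the spaces inside each keyword occurrence with sep (assumed goal)."""
    # Longest keywords first, so a longer phrase wins over a shorter one it contains.
    ordered = sorted(keywords, key=len, reverse=True)
    pattern = re.compile(r"\b(?:" + "|".join(map(re.escape, ordered)) + r")\b")
    return pattern.sub(lambda m: m.group(0).replace(" ", sep), text)


keywords = ['machine learning', 'data science', 'artificial intelligence']
text = ('the new machine learning model built by google is revolutionary for '
        'the current state of artificial intelligence. it may change the way '
        'we are thinking')

joined = join_keywords(text, keywords)
print(joined)
```

Compiling a single alternation pattern means each text is scanned once, regardless of how many keywords there are. The `\b` word boundaries keep a keyword from matching inside a longer word.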

Working from raw source data from a filepath containing spaces using make and Makefiles

放肆的年华 submitted on 2021-01-29 09:43:34
Question: I have a repository that uses Python scripts and a Makefile. I want a setup procedure that lets other users easily set up an environment and copy in the necessary data files from our server. The problem with including the source data files in the Makefile is that the company server uses spaces in the drive name, which make doesn't like very much, so I can't list those files as dependencies for the target output file. My current Makefile basically does only the following: .PHONY : all all

Multiple plots with matplotlib in Python

旧巷老猫 submitted on 2021-01-29 07:21:05
Question: I'm new to Python and learning it by doing. I want to make two plots with matplotlib in Python, but the second plot keeps the limits of the first one. How can I give each new plot its own limits instead of those of the previous plot? What is the recommended method? Any help, please. X1 = [80, 100, 120, 140, 160, 180, 200, 220, 240, 260] Y1 = [70, 65, 90, 95, 110, 115, 120, 140, 155, 150] from matplotlib import pyplot as plt plt.plot( X1 , Y1 , color = "green" , marker = "o" , linestyle = "solid" ) plt.show() X2 = [80,

How to fix "ValueError: Input 0 is incompatible with layer flatten: expected min_ndim=3, found ndim=2" error when loading model

风格不统一 submitted on 2021-01-29 05:33:56
Question: I'm trying to save and load my Keras model. It trains, evaluates, and saves fine (using .h5 to save the model), but when I try to load the model I get the following error: ValueError: Input 0 is incompatible with layer flatten: expected min_ndim=3, found ndim=2. Am I loading the model incorrectly? Any help would be appreciated! This is the code block where I save the model: def ml(self): model = tf.keras.models.Sequential() model.add(tf.keras.layers.Flatten()) self.addLayer(model,145,6)

Matplotlib magic in Jupyter notebook

时光怂恿深爱的人放手 submitted on 2021-01-28 19:32:24
Question: When I use the magic %matplotlib notebook, it shows the following warning: Warning: Cannot change to a different GUI toolkit: notebook. Using gtk3 instead. What is the reason for this warning, and how can I get rid of it?

Source: https://stackoverflow.com/questions/61625435/matplotlib-magic-in-jupyter-notebook

KMeans clustering unbalanced data

只谈情不闲聊 submitted on 2021-01-28 18:57:55
Question: I have a set of data with 50 features (c1, c2, c3 ...) and over 80k rows. Each row contains normalised numerical values (ranging 0-1). It is actually a normalised dummy variable, whereby some rows have only a few features, 3-4 (i.e. 0 is assigned if there is no value). Most rows have about 10-20 features. I used KMeans to cluster the data, and it always results in one cluster with a large number of members. Upon analysis, I noticed that rows with fewer than 4 features tend to get clustered together

sess.run() and “.eval()” in tensorflow programming

ⅰ亾dé卋堺 submitted on 2021-01-28 12:12:11
Question: In TensorFlow programming, can someone please explain the difference between ".eval()" and "sess.run()"? What does each of them do, and when should each be used?

Answer 1: A session object encapsulates the environment in which Tensor objects are evaluated. If x is a tf.Tensor object, tf.Tensor.eval is shorthand for tf.Session.run, where sess is the current tf.get_default_session. You can make a session the default as below: x = tf.constant(5.0) y = tf.constant(6.0) z = x * y with tf.Session() as sess: