vaex

How do I troubleshoot ValueError: array is of length %s, while the length of the DataFrame is %s?

半城伤御伤魂 submitted on 2021-01-28 18:51:15
Question: I'm trying to follow the example in this notebook. As suggested in this GitHub thread, I've raised the ulimit to 9999, and I've already converted the CSV files to HDF5. My code fails when trying to open a single HDF5 file into a DataFrame:

    df = vaex.open('data/chat_history_00.hdf5')

Here's the rest of the code:

    import re
    import glob
    import vaex
    import numpy as np

    def tryint(s):
        try:
            return int(s)
        except:
            return s

    def alphanum_key(s):
        """ Turn a string into a list of string and number chunks. "z23a"
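The helper that is cut off above looks like the common natural-sort recipe, which orders file names by their numeric parts instead of character by character. A minimal sketch, assuming that is the intent; every file name except `chat_history_00.hdf5` is hypothetical:

```python
import re


def tryint(s):
    """Return s as an int when it is all digits, otherwise return it unchanged."""
    try:
        return int(s)
    except ValueError:
        return s


def alphanum_key(s):
    """Split a string into alternating text and number chunks, e.g. "z23a" -> ["z", 23, "a"]."""
    return [tryint(chunk) for chunk in re.split(r"([0-9]+)", s)]


# Hypothetical file names; only chat_history_00.hdf5 appears in the question.
files = ["chat_history_10.hdf5", "chat_history_2.hdf5", "chat_history_00.hdf5"]
ordered = sorted(files, key=alphanum_key)
print(ordered)
```

Because the split pattern uses a capture group, text and number chunks always alternate in the same positions, so the keys compare cleanly under Python 3. Catching `ValueError` rather than using a bare `except` keeps unrelated errors visible.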

Python Vaex data type conversion

…衆ロ難τιáo~ submitted on 2021-01-24 11:37:05
Question: I'm using the Vaex library in Python for a project. I'm still very new to Vaex, so I apologize if this is elementary. I'm having an issue with a data type conversion: one of my columns, 'Paid_at', has a datatype of str, and it should be a DateTime.

    df_paid.info

What I've done so far is drop the NA values from my df and (try to) use pandas' to_datetime() to convert the column, but it isn't working. This has worked in a pandas DataFrame, but I am doing something wrong, as I am receiving the
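For orientation, the target type here is NumPy's `datetime64`. The sketch below shows only the string-to-`datetime64` step in plain NumPy, with hypothetical sample values standing in for the 'Paid_at' column. It assumes ISO 8601 formatted strings, which may not match the real data. Vaex expressions also expose an `astype` method that may accept `'datetime64'` directly; that is an assumption and is not exercised here.

```python
import numpy as np

# Hypothetical sample values; the real 'Paid_at' format is not shown in the question.
paid_at_str = ["2021-01-20T09:15:00", "2021-01-21T17:40:30"]

# NumPy parses ISO 8601 strings when asked for a datetime64 dtype.
paid_at_dt = np.array(paid_at_str, dtype="datetime64[ns]")

print(paid_at_dt.dtype)
print(paid_at_dt[1] - paid_at_dt[0])  # difference as a timedelta64
```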

Python Vaex data type conversion

风格不统一 submitted on 2021-01-24 11:36:07
Question: I'm using the Vaex library in Python for a project. I'm still very new to Vaex, so I apologize if this is elementary. I'm having an issue with a data type conversion: one of my columns, 'Paid_at', has a datatype of str, and it should be a DateTime.

    df_paid.info

What I've done so far is drop the NA values from my df and (try to) use pandas' to_datetime() to convert the column, but it isn't working. This has worked in a pandas DataFrame, but I am doing something wrong, as I am receiving the

Convert large hdf5 dataset written via pandas/pytables to vaex

怎甘沉沦 submitted on 2020-01-14 06:22:19
Question: I have a very large dataset that I write to HDF5 in chunks via append, like so:

    with pd.HDFStore(self.train_store_path) as train_store:
        for filepath in tqdm(filepaths):
            with open(filepath, 'rb') as file:
                frame = pickle.load(file)
            if frame.empty:
                os.remove(filepath)
                continue
            try:
                train_store.append(
                    key='dataset', value=frame, min_itemsize=itemsize_dict)
                os.remove(filepath)
            except KeyError as e:
                print(e)
            except ValueError as e:
                print(frame)
                print(e)
            except Exception as e:
                print(e)

The data is far
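One possible approach to the conversion in the title is to read the pandas/PyTables store back in row ranges and export each range as its own vaex HDF5 file. The sketch below assumes the store was written in table format under the key `'dataset'`, as in the snippet above. The function names, the chunk size, and the output file pattern are hypothetical.

```python
def chunk_bounds(nrows, chunksize):
    """Yield (start, stop) row ranges that cover nrows in steps of chunksize."""
    for start in range(0, nrows, chunksize):
        yield start, min(start + chunksize, nrows)


def convert_store(store_path, key="dataset", chunksize=1_000_000,
                  out_pattern="part_{:04d}.hdf5"):
    """Export a table-format pandas HDFStore as several vaex HDF5 files."""
    # Imported lazily so chunk_bounds can be used without these packages installed.
    import pandas as pd
    import vaex

    with pd.HDFStore(store_path, mode="r") as store:
        nrows = store.get_storer(key).nrows  # row count of the stored table
        for i, (start, stop) in enumerate(chunk_bounds(nrows, chunksize)):
            frame = store.select(key, start=start, stop=stop)
            vaex.from_pandas(frame, copy_index=False).export_hdf5(out_pattern.format(i))


print(list(chunk_bounds(10, 4)))
```

Only one chunk is held in memory at a time. The resulting parts could then be opened together, for example by passing a wildcard such as `part_*.hdf5` to `vaex.open`, assuming the installed vaex version supports glob patterns there.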

How to change the point style in a vaex interactive Jupyter bqplot plot_widget to make individual points larger and visible?

三世轮回 submitted on 2019-12-11 01:38:58
Question: I am evaluating vaex for an interactive outlier selection use case described at: Large plot: ~20 million samples, gigabytes of data. Basically, I have some individual points that are outliers, and I want to see them on a graph so I can manually select them and then examine them further. The problem is that individual points become invisible if the rest of the dataset is too large. How can I make such individual points visible? For example, if I generate a dataset with 1 billion points and one outlier

How to do interactive 2D scatter plot zoom / point selection in Vaex?

大兔子大兔子 submitted on 2019-12-10 10:55:54
Question: I saw that it is possible to do this during the demo: https://youtu.be/2Tt0i823-ec?t=769 There, the presenter has a huge dataset and can quickly zoom in by selecting a rectangle with the mouse. I also saw the "Interactive Widgets" section of the tutorial: https://docs.vaex.io/en/latest/tutorial.html#Interactive-widgets However, I was not able to easily replicate that setup. What are the minimal steps to achieve it? On Ubuntu 19.04 with vaex 2.0.2, I have tried:

    python3 -m pip install --user vaex

How to do interactive 2D scatter plot zoom / point selection in Vaex?

你离开我真会死。 submitted on 2019-12-06 07:30:37
I saw that it is possible to do this during the demo: https://youtu.be/2Tt0i823-ec?t=769 There, the presenter has a huge dataset and can quickly zoom in by selecting a rectangle with the mouse. I also saw the "Interactive Widgets" section of the tutorial: https://docs.vaex.io/en/latest/tutorial.html#Interactive-widgets However, I was not able to easily replicate that setup. What are the minimal steps to achieve it? On Ubuntu 19.04 with vaex 2.0.2, I have tried:

    python3 -m pip install --user vaex scipy pandas vaex-jupyter
    jupyter nbextension enable --py widgetsnbextension
    jupyter nbextension enable -