Is there a good and easy way to visualize high dimensional data?

借酒劲吻你 2021-01-31 00:13

Can someone please tell me if there is a good (easy) way to visualize high dimensional data? My data currently has 21 dimensions, but I would like to see whether it is dense o

9 Answers
  • 2021-01-31 00:24

    Principal component analysis could be helpful if the dimensions are correlated.
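As a rough illustration of the PCA suggestion, here is a minimal sketch that projects 21-dimensional data onto its first two principal components using NumPy's SVD. The data is synthetic (two hidden factors plus noise, an assumption made only for the demo), and `pca_project` is a hypothetical helper name, not part of any library.

```python
import numpy as np

def pca_project(X, n_components=2):
    """Project rows of X onto the top principal components via SVD."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                      # center each dimension
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    coords = Xc @ Vt[:n_components].T            # coordinates in PC space
    explained = (S ** 2) / np.sum(S ** 2)        # variance ratio per component
    return coords, explained[:n_components]

# Synthetic stand-in for the real data: 2 latent factors mixed into 21 dims.
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 2))
mixing = rng.normal(size=(2, 21))
X = latent @ mixing + 0.05 * rng.normal(size=(500, 21))

coords, explained = pca_project(X)
print(coords.shape, explained.sum())
```

If the two retained components explain most of the variance, a scatter plot of `coords` is a fair picture of the data; if they explain little, the dimensions are probably not strongly correlated and PCA will hide much of the structure.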

  • 2021-01-31 00:25

    The buzzword I would search for is multidimensional scaling. It is a technique to develop a projection from the high dimensional space to a lower space (2 or 3 dimensional) in such a way that points which are close in the full space will be close in the projection.

    It is often used for visualising the output of clustering algorithms (i.e. if your clusters are compact in the MDS projection there is a good chance they are also in the full space).

    Edit: This wouldn't necessarily help with determining whether the data is dense or sparse, because you lose the scale in the projection, but it would show whether it is uniform or clumpy (perhaps that's what you mean).
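A minimal sketch of one MDS variant, classical (Torgerson) MDS with Euclidean distances, written in plain NumPy. The two-cluster data and the helper name `classical_mds` are assumptions for illustration; other MDS variants (for example non-metric MDS) use a different optimisation.

```python
import numpy as np

def classical_mds(X, n_components=2):
    """Embed rows of X so that pairwise Euclidean distances are approximately kept."""
    X = np.asarray(X, dtype=float)
    sq = np.sum(X ** 2, axis=1)
    D2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)   # squared distances
    np.maximum(D2, 0.0, out=D2)                        # clip rounding noise
    n = D2.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n                # centering matrix
    B = -0.5 * J @ D2 @ J                              # double-centered Gram matrix
    vals, vecs = np.linalg.eigh(B)                     # eigenvalues in ascending order
    top = np.argsort(vals)[::-1][:n_components]
    return vecs[:, top] * np.sqrt(np.clip(vals[top], 0.0, None))

# Synthetic example: two clumps in 21 dimensions.
rng = np.random.default_rng(0)
a = rng.normal(size=(100, 21))
b = rng.normal(size=(100, 21)) + 5.0
X = np.vstack([a, b])

Y = classical_mds(X)
gap = np.linalg.norm(Y[:100].mean(axis=0) - Y[100:].mean(axis=0))
print(Y.shape, gap)
```

Plotting `Y` as a 2D scatter should show the two clumps as separate groups, which is the "uniform or clumpy" check described above.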

  • 2021-01-31 00:27

    Parallel coordinates are a popular method for visualizing high-dimensional data.

    Which visualization is best for your data in particular will depend on its characteristics: how correlated are the different dimensions?
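A small sketch of how a parallel-coordinates plot can be built by hand: each dimension is min-max scaled to [0, 1] and each data point becomes one polyline across the axes. The helper names `minmax_scale_columns` and `plot_parallel_coordinates` are hypothetical, the data is random, and matplotlib is assumed to be installed for the plotting part.

```python
import numpy as np

def minmax_scale_columns(X):
    """Scale every column to [0, 1] so all axes share one vertical range."""
    X = np.asarray(X, dtype=float)
    lo = X.min(axis=0)
    span = X.max(axis=0) - lo
    span[span == 0] = 1.0            # constant columns map to 0 instead of NaN
    return (X - lo) / span

def plot_parallel_coordinates(X, names=None, ax=None):
    """Draw one polyline per row of X across one vertical axis per column."""
    import matplotlib.pyplot as plt  # imported lazily; only needed for plotting
    scaled = minmax_scale_columns(X)
    positions = np.arange(scaled.shape[1])
    if ax is None:
        _, ax = plt.subplots(figsize=(12, 4))
    # Passing a 2D array plots each column as a line: one line per data point.
    ax.plot(positions, scaled.T, color="tab:blue", alpha=0.2, linewidth=0.8)
    ax.set_xticks(positions)
    ax.set_xticklabels(names if names is not None else [f"x{i}" for i in positions])
    ax.set_ylabel("min-max scaled value")
    return ax

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 21))       # stand-in for the 21-dimensional data
scaled = minmax_scale_columns(X)
print(scaled.shape, scaled.min(), scaled.max())
```

Calling `plot_parallel_coordinates(X)` followed by `matplotlib.pyplot.show()` displays the figure. With 21 axes the lines overlap heavily, so a low `alpha` and a sensible axis order (correlated dimensions next to each other) matter a lot.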

  • 2021-01-31 00:27

    I'm not sure what kind of patterns you would like to see in the data. t-SNE and its faster variant, Barnes-Hut-SNE, do a very good job of visualizing groups of related items in high-dimensional data. It is available in R.

    There is a short tutorial on applying it to data with about 300 dimensions: http://www.codeproject.com/Tips/788739/Visualizing-High-Dimensional-Vector-using-T-SNE-wi
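The answer above points to R; as a swapped-in alternative, here is a minimal Python sketch using scikit-learn's `TSNE` with its Barnes-Hut approximation. It assumes scikit-learn is installed, and the three-cluster data is synthetic. The perplexity value is only a common starting point, not a recommendation for any particular dataset.

```python
import numpy as np
from sklearn.manifold import TSNE

# Synthetic stand-in: three groups of points in 21 dimensions.
rng = np.random.default_rng(0)
centers = rng.normal(scale=10.0, size=(3, 21))
X = np.vstack([c + rng.normal(size=(100, 21)) for c in centers])

tsne = TSNE(
    n_components=2,
    perplexity=30.0,        # rough neighbourhood size; must be below the sample count
    method="barnes_hut",    # the faster approximate variant mentioned above
    init="pca",
    random_state=0,
)
embedding = tsne.fit_transform(X)
print(embedding.shape)
```

A scatter plot of `embedding` should show the groups as separate islands. Distances between islands and their sizes in a t-SNE plot are not reliable, so it is better suited to spotting groups than to judging density.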

  • 2021-01-31 00:28

    Try using http://hypertools.readthedocs.io/en/latest/.

    HyperTools is a library for visualizing and manipulating high-dimensional data in Python.

  • 2021-01-31 00:31

    The curios.IT data exploration software is designed for the visualization of high dimensional data: data is shown as a collection of 3D objects (one for each data group) which can show up to 13 variables at the same time. The relationships between data variables and visual features are much easier to remember than with other techniques (like parallel coordinates).
