Why does it take ages to install Pandas on Alpine Linux?

孤独总比滥情好 2020-11-29 18:19

I've noticed that installing Pandas and NumPy (its dependency) in a Docker container using the base OS Alpine vs. CentOS or Debian takes much longer. I created a little t

8 Answers
  • 2020-11-29 19:11

    Just going to bring some of these answers together in one answer and add a detail I think was missed. The reason certain Python libraries, particularly optimized math and data libraries, take so long to build on Alpine is that the pip wheels for these libraries include binaries precompiled from C/C++ and linked against glibc, the GNU implementation of the C standard library. Debian, Fedora, and CentOS all (typically) use glibc, but Alpine, in order to stay lightweight, uses musl libc instead. C/C++ binaries built on a glibc system will not work on a system without glibc, and the same goes for musl.

    Pip looks first for a wheel with the correct binaries; if it can't find one, it tries to compile the binaries from the C/C++ source and link them against musl. In many cases, this won't even work unless you have the Python headers from python3-dev and build tools like make.
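
    To check which side of the glibc/musl split a given container is on, here is a minimal sketch using only the standard library. The helper name describe_libc is hypothetical, and the sketch assumes that platform.libc_ver() reports empty strings on musl-based systems such as Alpine, which is the usual behaviour but not something the function guarantees.

```python
import platform


def describe_libc(name: str, version: str) -> str:
    """Summarize what the detected C library means for glibc-linked wheels.

    Hypothetical helper for illustration; not part of pip or pandas.
    """
    if name == "glibc":
        return f"glibc {version}: wheels linked against glibc can load here"
    # Assumption: an empty name usually means a non-glibc libc such as musl.
    return "glibc not detected: pip will likely fall back to building from source"


if __name__ == "__main__":
    # libc_ver() inspects the running interpreter, so the result depends on the host.
    name, version = platform.libc_ver()
    print(describe_libc(name, version))
```

    Running this inside a Debian-based image and then inside an Alpine-based image should show the difference that decides whether pip can use a precompiled wheel.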

    Now the silver lining: as others have mentioned, there are apk packages with the proper binaries provided by the community. Using these will save you the (sometimes lengthy) process of building the binaries.

  • 2020-11-29 19:15

    pandas is considered a community-supported package, so the answers pointing to edge/testing are not going to work: Alpine does not officially support pandas as a core package (it still works; it's just not supported by the core Alpine developers).

    Try this Dockerfile:

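    FROM python:3.8-alpine
    # Register the edge community repository under the @community tag,
    # then install the prebuilt pandas package from that repository.
    # --no-cache fetches the package index on the fly instead of relying on a local one.
    RUN echo "@community http://dl-cdn.alpinelinux.org/alpine/edge/community" >> /etc/apk/repositories \
        && apk add --no-cache py3-pandas@community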
    

    This works for the vanilla Alpine image too, using FROM alpine:3.12.
