Does using large libraries inherently make slower code?

情深已故 2021-02-06 20:40

I have a psychological tic which makes me reluctant to use large libraries (like GLib or Boost) in lower-level languages like C and C++. In my mind, I think: even if I only use one or two features of a large library, by linking to that library am I going to incur runtime performance overheads?

17 answers
  • 2021-02-06 21:22

    As others have said, there is some overhead when adding a dynamic library. When the library is first loaded, relocations must be performed, although this should be a minor cost if the library is compiled correctly. Looking up individual symbols also becomes more expensive, because there are more libraries to search.

    The cost in memory of adding another dynamic library depends largely on how much of it you actually use. A page of code will not be loaded from disk until something on it is executed. However, other data such as headers, symbol tables, and hash tables built into the library file will be loaded, and these are generally proportional to the size of the library.

    There is a great document by Ulrich Drepper, the lead contributor to glibc, "How To Write Shared Libraries", that describes the process and the overhead of dynamic libraries.
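
    To make those costs concrete, here is a minimal sketch (not from Drepper's paper) that loads a library and resolves one symbol explicitly, timing both steps. It assumes a POSIX system; libm.so.6 and the cos symbol are just illustrative choices, and on glibc systems the program must be linked with -ldl.

    // Minimal sketch: timing dlopen (load + relocations) and dlsym
    // (one symbol lookup). POSIX-only; library and symbol names are
    // illustrative. Build: g++ load_cost.cpp -ldl
    #include <dlfcn.h>
    #include <chrono>
    #include <cstdio>

    int main()
    {
        using clock = std::chrono::steady_clock;

        auto t0 = clock::now();
        void *lib = dlopen("libm.so.6", RTLD_NOW);  // pay relocation cost up front
        auto t1 = clock::now();
        if (!lib) { std::fprintf(stderr, "dlopen failed: %s\n", dlerror()); return 1; }

        auto t2 = clock::now();
        void *sym = dlsym(lib, "cos");              // one symbol lookup
        auto t3 = clock::now();
        if (!sym) { std::fprintf(stderr, "dlsym failed: %s\n", dlerror()); return 1; }

        auto us = [](auto a, auto b) {
            return (long long)std::chrono::duration_cast<std::chrono::microseconds>(b - a).count();
        };
        std::printf("dlopen: %lld us, dlsym: %lld us\n", us(t0, t1), us(t2, t3));

        dlclose(lib);
        return 0;
    }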

  • 2021-02-06 21:23

    It depends on how the linker works. Some linkers are lazy and will include all the code in the library. More efficient linkers will extract only the needed code from a library. I have had experience with both types.

    Smaller libraries cause fewer worries with either type of linker. The worst case with a small library is a small amount of unused code. Many small libraries may increase the build time, however; the trade-off is build time vs. code space.

    An interesting test of the linker is the classic Hello World program:

    #include <stdio.h>
    #include <stdlib.h>
    int main(void)
    {
      printf("Hello World\n");
      return EXIT_SUCCESS;
    }
    

    The printf function has a lot of dependencies due to all the formatting it may need. A lazy but fast linker may pull in a whole "standard library" chunk to resolve all the symbols. A more efficient linker will include only printf and its dependencies, which makes the link step slower.

    The above program can be compared to this one using puts:

    #include <stdio.h>
    #include <stdlib.h>
    int main(void)
    {
      puts("Hello World\n");
      return EXIT_SUCCESS;
    }
    

    Generally, the puts version should be smaller than the printf version, because puts does no formatting and thus has fewer dependencies. A lazy linker will generate the same code size for both programs.

    In summary, the impact of library size depends largely on the linker, specifically on how efficient it is. When in doubt, many small libraries will rely less on the linker's efficiency, but they make the build process more complicated and slower.
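
    As a concrete illustration of the "efficient linker" behavior, here is a sketch assuming a GNU toolchain: with -ffunction-sections each function gets its own section, and -Wl,--gc-sections lets the linker discard sections nothing references. The flags and the never_called name are illustrative, not from the answer above.

    /*
     * Sketch: observing dead-code removal (GNU toolchain assumed).
     *
     *   g++ -O2 -ffunction-sections -fdata-sections demo.cpp \
     *       -Wl,--gc-sections -o demo
     *
     * Compare the binary size with and without -Wl,--gc-sections;
     * `nm demo` shows whether never_called survived.
     */
    #include <cstdio>

    void never_called()            // candidate for removal by --gc-sections
    {
        std::puts("dead code");
    }

    int main()
    {
        std::puts("live code");
        return 0;
    }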

  • 2021-02-06 21:29

    There may be a small overhead when loading these libraries if they're dynamically linked. This will typically be a tiny, tiny fraction of the time your program spends running.

    However, there will be no overhead once everything is loaded.

    If you don't want to use all of boost, then don't. It's modular, so you can use the parts you want and ignore the rest.
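
    For example, a minimal sketch assuming Boost is installed: Boost.LexicalCast is header-only, so including that one header pulls in only that component and its dependencies, with nothing extra to link against.

    #include <boost/lexical_cast.hpp>  // one header-only Boost component
    #include <iostream>

    int main()
    {
        // Use a single Boost facility and ignore the rest of the collection.
        int n = boost::lexical_cast<int>("42");
        std::cout << n + 1 << '\n';  // prints 43
        return 0;
    }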

  • 2021-02-06 21:29

    You are very right to be worried, especially when it comes to Boost. It's not so much that the people writing these libraries are incompetent; it comes down to two issues.

    1. Templates are just inherently bloated code. This didn't matter as much 10 years ago, but nowadays the CPU is much faster than memory access, and this trend continues. I'd almost say templates are an obsolescent feature.

    It's not so bad for user code, which is usually somewhat practical, but in many libraries everything is defined in terms of other templates or is templated on multiple parameters (meaning exponential template code explosions; see the sketch at the end of this answer).

    Simply adding in iostream adds about 3 MB (!!!) to your code. Now add in some Boost nonsense and you have 30 MB of code if you simply declare a couple of particularly weird data structures.

    Worse, you can't even easily profile this. I can tell you the difference between code written by me and code from template libraries is DRAMATIC, but with a more naive approach you may conclude you are doing worse from a simple test, while the cost in code bloat takes its toll in a large real-world app.

    2. Complexity. When you look at the things in Boost, they are all things that complicate your code to a huge degree: smart pointers, functors, all sorts of complicated stuff. Now, I won't say it's never a good idea to use this stuff, but pretty much all of it has a big cost of some kind. Especially if you don't understand exactly, and I mean exactly, what it's doing.

    But people rave about it and pretend it has something to do with 'design', so people get the impression that it's the way you should do everything, rather than a set of extremely specialized tools that should be used seldom, if ever.
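
    Here is the sketch referred to in point 1, a tiny hypothetical example of why instantiation multiplies code size: each distinct type argument stamps out a separate copy of the template's machine code.

    #include <cstdio>

    template <typename T>
    T add(T a, T b)
    {
        return a + b;
    }

    int main()
    {
        // Three instantiations: add<int>, add<long>, add<double>.
        // With a GNU toolchain, `nm` on the object file shows one weak
        // symbol per instantiation.
        std::printf("%d %ld %f\n", add(1, 2), add(1L, 2L), add(1.0, 2.0));
        return 0;
    }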

  • 2021-02-06 21:30

    Even if I only use one or two features of a large library, by linking to that library am I going to incur runtime performance overheads?

    In general, no.

    If the library in question doesn't have a lot of position-independent code, then there will be a start-up cost while the dynamic linker performs relocations on the library when it's requested. Usually, that's part of the program's start-up. There is no run-time performance effect beyond that.

    Linkers are also good at removing "dead code" from statically-linked libraries at build time, so any static libraries you use will have minimal size overhead. Performance doesn't even enter into it.

    Frankly, you're worrying about the wrong things.

  • 2021-02-06 21:30

    Bigger doesn't inherently imply slower. Contrary to some of the other answers, there's no inherent difference between libraries stored entirely in headers and libraries stored in object files either.

    Header-only libraries can have an indirect advantage. Most template-based libraries have to be header-only (or a lot of the code ends up in headers anyway), and templates do give a lot of opportunities for optimization. Taking code in a typical object-file library and moving it all into headers will not, however, usually have many good effects (and could lead to code bloat).
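
    A small sketch of that optimization opportunity, using the classic std::sort versus qsort comparison: because the comparator is a template argument, std::sort can inline it, while qsort must call through a function pointer on every comparison.

    #include <algorithm>
    #include <cstdlib>

    // qsort's comparator: called indirectly through a function pointer.
    int compare_ints(const void *a, const void *b)
    {
        int x = *static_cast<const int *>(a);
        int y = *static_cast<const int *>(b);
        return (x > y) - (x < y);  // avoids overflow of x - y
    }

    int main()
    {
        int a[] = {3, 1, 2};
        int b[] = {3, 1, 2};

        std::qsort(a, 3, sizeof(int), compare_ints);               // indirect call per comparison
        std::sort(b, b + 3, [](int x, int y) { return x < y; });   // comparator can be inlined
        return 0;
    }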

    The real answer for a particular library will usually depend on its overall structure. It's easy to think of "Boost" as something huge. In fact, it's a huge collection of libraries, most of which are individually quite small. You can't say very much (meaningfully) about Boost as a whole, because the individual libraries are written by different people, with different techniques, goals, etc. A few of them (e.g. Format, Assign) really are slower than almost anything you'd be very likely to do on your own. Others (e.g. Pool) provide things you could do yourself, but probably won't, to get at least minor speed improvements. A few (e.g. uBlas) use heavy-duty template magic to run faster than any but a tiny percentage of us can hope to achieve on our own.
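
    As a sketch of the Pool case, assuming Boost.Pool is available: an object_pool serves fixed-size allocations from pre-allocated blocks, which is exactly the kind of minor speed improvement you could write yourself but probably won't.

    #include <boost/pool/object_pool.hpp>

    struct Node { int value; };

    int main()
    {
        boost::object_pool<Node> pool;

        Node *n = pool.construct();  // allocate from the pool's block
        n->value = 42;
        pool.destroy(n);             // hand the memory back to the pool
        return 0;                    // anything left is freed with the pool
    }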

    There are, of course, quite a few libraries that really are individually large. In quite a few cases, these really are slower than what you'd write yourself. In particular, many (most?) of them attempt to be much more general than almost anything you'd be at all likely to write on your own. While that doesn't necessarily lead to slower code, there's definitely a strong tendency in that direction. As with a lot of other code, when you're developing libraries commercially, customers tend to be a lot more interested in features than in things like size or speed.

    Some libraries also devote a lot of space and code (and often at least bits of time) to solving problems you may very well not care about at all. Just for example, years ago I used an image processing library. Its support for 200+ image formats sounded really impressive (and in a way it really was), but I'm pretty sure I never used it to deal with more than about a dozen formats (and I could probably have gotten by supporting only half that many). OTOH, even with all that it was still pretty fast. Supporting fewer formats might have restricted their market to the point that the code would actually have been slower (just for example, it handled JPEGs faster than IJG).
