Converting C source to C++

迷失自我 2021-01-30 16:52

How would you go about converting a reasonably large (>300K), fairly mature C codebase to C++?

The kind of C I have in mind is split into files roughly corresponding to

11 answers
  • 2021-01-30 17:07

    Your application has lots of folks working on it, and a need to not-be-broken. If you are serious about large scale conversion to an OO style, what you need is massive transformation tools to automate the work.

    The basic idea is to designate groups of data as classes, and then get the tool to refactor the code to move that data into classes, move functions on just that data into those classes, and revise all accesses to that data to calls on the classes.
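
    As a rough illustration of the transformation described above (done by hand here, not by any tool), here is a minimal before-and-after sketch. The names `counter_c`, `counter_c_increment`, and `Counter` are hypothetical and chosen only for the example.

```cpp
#include <cassert>

// Before: C style. The data is a plain struct that any function may touch.
struct counter_c {
    int value;
    int limit;
};

inline void counter_c_increment(counter_c* c) {
    if (c->value < c->limit) {
        ++c->value;
    }
}

// After: the same data grouped into a class. The function that only
// touched this data becomes a member, and callers use the class interface
// instead of reading the fields directly.
class Counter {
public:
    explicit Counter(int limit) : value_(0), limit_(limit) {
        assert(limit >= 0);  // invariant that the C struct could not enforce
    }

    void increment() {
        if (value_ < limit_) {
            ++value_;
        }
    }

    int value() const { return value_; }

private:
    int value_;
    int limit_;
};
```

    Every former `c->value` read becomes a `value()` call, which is the "revise all accesses" step; in a large codebase that is the part that most needs automation.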

    You can do an automated pre-analysis to form statistical clusters and get some ideas, but you'll still need an application-aware engineer to decide which data elements should be grouped.

    A tool capable of this task is our DMS Software Reengineering Toolkit. DMS has strong C parsers for reading your code, captures the C code as compiler abstract syntax trees, and (unlike a conventional compiler) can compute flow analyses across your entire 300K SLOC. DMS also has a C++ front end that can be used as the "back" end; one writes transformations that map C syntax to C++ syntax.

    A major C++ reengineering task on a large avionics system gives some idea of what using DMS for this kind of activity is like. See the technical papers at www.semdesigns.com/Products/DMS/DMSToolkit.html, specifically "Re-engineering C++ Component Models Via Automatic Program Transformation".

    This process is not for the faint of heart. But then, anybody who would consider manually refactoring a large application is already not afraid of hard work.

    Yes, I'm associated with the company, being its chief architect.

  • 2021-01-30 17:10

    Let me throw out another stupid idea:

    1. Compile everything in C++'s C subset and get that working.
    2. Start with a module, convert it into a huge class, then into an instance, and build a C interface (identical to the one you started from) out of that instance. Let the remaining C code work with that C interface.
    3. Refactor as needed, growing the OO subsystem out of C code one module at a time, and drop parts of the C interface when they become useless.
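
    Step 2 could look roughly like the sketch below. It assumes a hypothetical symbol-table module whose original C interface was `symtab_define` and `symtab_lookup`; those names, and the class `SymbolTable`, are invented for illustration.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// The module's former file-level state and functions, gathered into one class.
class SymbolTable {
public:
    void define(const std::string& name, int value) { table_[name] = value; }

    bool lookup(const std::string& name, int* out) const {
        auto it = table_.find(name);
        if (it == table_.end()) {
            return false;
        }
        *out = it->second;
        return true;
    }

private:
    std::unordered_map<std::string, int> table_;
};

namespace {
// A single instance standing in for the module's old global state.
SymbolTable& symtab_instance() {
    static SymbolTable table;
    return table;
}
}  // namespace

// The original C interface, unchanged for callers, now forwarding to the
// instance. A production wrapper would also catch C++ exceptions here so
// that none escape into C code.
extern "C" {
void symtab_define(const char* name, int value) {
    assert(name != nullptr);
    symtab_instance().define(name, value);
}

int symtab_lookup(const char* name, int* out) {
    assert(name != nullptr && out != nullptr);
    return symtab_instance().lookup(name, out) ? 1 : 0;
}
}
```

    The remaining C code keeps calling `symtab_define` and `symtab_lookup` as before; once no C caller is left, the `extern "C"` block can be dropped, as step 3 suggests.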
  • 2021-01-30 17:10

    Here's what I would do:

    • Since the code is 20 years old, scrap the parser/syntax analyzer and replace it with newer C++ code based on lex/yacc/bison (or anything similar), which is much more maintainable and easier to understand. It is faster to develop too if you have a BNF grammar handy.
    • Once this is retrofitted to the old code, start wrapping modules into classes. Replace global/shared variables with interfaces.
    • Now what you have will be a compiler in C++ (not quite though).
    • Draw a class diagram of all the classes in your system, and see how they are communicating.
    • Draw another one using the same classes and see how they ought to communicate.
    • Refactor the code to transform the first diagram to the second. (this might be messy and tricky)
    • Remember to use C++ code for all new code added.
    • If you have some time left, try replacing data structures one by one to use the more standardized STL or Boost.
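
    To make the "replace global/shared variables with interfaces" step concrete, here is a minimal sketch. It assumes a hypothetical global error counter (`g_error_count`) that many modules used to increment directly; `Diagnostics`, `CollectingDiagnostics`, and `check_identifier` are names invented for the example.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Before (C style): `int g_error_count;` was incremented directly by
// every module that detected a problem.

// After: modules depend on a small interface instead of a shared variable.
class Diagnostics {
public:
    virtual ~Diagnostics() = default;
    virtual void report_error(const std::string& message) = 0;
    virtual int error_count() const = 0;
};

// One possible implementation: collect the messages in memory.
class CollectingDiagnostics : public Diagnostics {
public:
    void report_error(const std::string& message) override {
        messages_.push_back(message);
    }

    int error_count() const override {
        return static_cast<int>(messages_.size());
    }

private:
    std::vector<std::string> messages_;
};

// A former free function that touched the global now receives the interface.
inline bool check_identifier(const std::string& name, Diagnostics& diag) {
    if (name.empty()) {
        diag.report_error("empty identifier");
        return false;
    }
    return true;
}
```

    Passing the interface explicitly also makes the dependencies visible, which helps when drawing the "how they are communicating" diagram.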
  • 2021-01-30 17:11

    Besides how you want to start, there are probably two things to consider: what you want to focus on, and where you want to stop.

    You state that there is a lot of code churn; this may be the key to focusing your efforts. I suggest you pick the parts of your code that need a lot of maintenance. The mature, stable parts are apparently working well enough, so it is better to leave them as they are, except perhaps for some window dressing with facades and the like.

    Where you want to stop depends on what the reason is for wanting to convert to C++. This can hardly be a goal in itself. If it is due to some 3rd party dependency, focus your efforts on the interface to that component.

    The software I work on is a huge, old code base that was 'converted' from C to C++ years ago. I think it was because the GUI was converted to Qt. Even now it still mostly looks like a C program with classes. Breaking the dependencies caused by public data members, and refactoring the huge classes with procedural monster methods into smaller methods and classes, has never really taken off, I think for the following reasons:

    1. There is no need to change code that is working and that does not need to be enhanced. Doing so introduces new bugs without adding functionality, and end users don't appreciate that;
    2. It is very, very hard to refactor reliably. Many pieces of code are so large and so vital that people hardly dare touch them. We have a fairly extensive suite of functional tests, but sufficient code coverage information is hard to get. As a result, it is difficult to establish whether there are already sufficient tests in place to detect problems during refactoring;
    3. The ROI is difficult to establish. The end user will not benefit from refactoring, so it must be in reduced maintenance cost, which will increase initially because by refactoring you introduce new bugs in mature, i.e. fairly bug-free code. And the refactoring itself will be costly as well ...

    NB. I suppose you know the book "Working Effectively with Legacy Code"?

  • 2021-01-30 17:12

    Having just started on pretty much the same thing a few months ago (on a ten-year-old commercial project, originally written with the "C++ is nothing but C with smart structs" philosophy), I would suggest using the same strategy you'd use to eat an elephant: take it one bite at a time. :-)

    As much as possible, split it up into stages that can be done with minimal effects on other parts. Building a facade system, as Federico Ramponi suggested, is a good start -- once everything has a C++ facade and is communicating through it, you can change the internals of the modules with fair certainty that they can't affect anything outside them.
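
    A facade of this kind might be sketched as follows. The `legacy_buffer_*` functions stand in for an existing C module and are hypothetical; they are defined inline only so the example is self-contained. `Buffer` is the assumed C++ facade.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <new>
#include <string>

// Hypothetical legacy C module (normally it would live in its own .c file).
extern "C" {
struct legacy_buffer {
    char* data;
    std::size_t length;
};

legacy_buffer* legacy_buffer_create(void) {
    legacy_buffer* b =
        static_cast<legacy_buffer*>(std::malloc(sizeof(legacy_buffer)));
    if (b) {
        b->data = nullptr;
        b->length = 0;
    }
    return b;
}

int legacy_buffer_append(legacy_buffer* b, const char* text) {
    std::size_t n = std::strlen(text);
    char* grown = static_cast<char*>(std::realloc(b->data, b->length + n + 1));
    if (!grown) {
        return -1;
    }
    std::memcpy(grown + b->length, text, n + 1);  // copies the terminator too
    b->data = grown;
    b->length += n;
    return 0;
}

void legacy_buffer_destroy(legacy_buffer* b) {
    if (b) {
        std::free(b->data);
        std::free(b);
    }
}
}

// C++ facade: callers see only this class; the C calls stay behind it.
class Buffer {
public:
    Buffer() : impl_(legacy_buffer_create()) {
        if (!impl_) throw std::bad_alloc();
    }
    ~Buffer() { legacy_buffer_destroy(impl_); }

    // The facade owns the C handle, so copying is disabled in this sketch.
    Buffer(const Buffer&) = delete;
    Buffer& operator=(const Buffer&) = delete;

    void append(const std::string& text) {
        if (legacy_buffer_append(impl_, text.c_str()) != 0) {
            throw std::bad_alloc();
        }
    }

    std::string str() const {
        return impl_->data ? std::string(impl_->data, impl_->length)
                           : std::string();
    }

private:
    legacy_buffer* impl_;
};
```

    Once all callers go through `Buffer`, the internals can later be replaced (for instance by a `std::string` member) without touching any code outside the module.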

    We already had a partial C++ interface system in place (due to previous smaller refactoring efforts), so this approach wasn't difficult in our case. Once we had everything communicating as C++ objects (which took a few weeks, working on a completely separate source-code branch and integrating all changes to the main branch as they were approved), it was very seldom that we couldn't compile a totally working version before we left for the day.

    The change-over isn't complete yet -- we've paused twice for interim releases (we aim for a point-release every few weeks), but it's well on the way, and no customer has complained about any problems. Our QA people have only found one problem that I recall, too. :-)

  • 2021-01-30 17:13

    Your list looks okay except I would suggest reviewing the test suite first and trying to get that as tight as possible before doing any coding.
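
    One common way to tighten a suite before touching legacy code is a characterization test, which pins down what the code does today, quirks included. The sketch below assumes a hypothetical legacy routine `legacy_trim_right`; both it and `characterize_trim_right` are invented for illustration.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical legacy routine, kept exactly as found: trims trailing
// spaces in place and returns the new length.
extern "C" int legacy_trim_right(char* s) {
    int n = static_cast<int>(std::strlen(s));
    while (n > 0 && s[n - 1] == ' ') {
        s[--n] = '\0';
    }
    return n;
}

// Characterization test: records current behavior so that a later
// conversion to C++ can be checked against it.
inline bool characterize_trim_right() {
    char a[] = "abc  ";
    char b[] = "   ";
    char c[] = "a\t ";  // tabs are not trimmed today; pinned, not "fixed"

    bool ok = true;
    ok = ok && legacy_trim_right(a) == 3 && std::strcmp(a, "abc") == 0;
    ok = ok && legacy_trim_right(b) == 0 && std::strcmp(b, "") == 0;
    ok = ok && legacy_trim_right(c) == 2 && std::strcmp(c, "a\t") == 0;
    return ok;
}
```

    If the converted version changes any of these results, the test flags it, and the team can decide deliberately whether the old behavior was a bug or something callers rely on.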
