C++ Extract number from the middle of a string

天涯浪子 submitted on 2019-11-28 09:40:54

Updated for C++11 (2018-12-04)

I tried to update this answer to use C++11 on my machine but failed, because my g++ compiler does not have full <regex> support. I kept getting uncaught std::regex_error exceptions with code=4 (i.e. "missing bracket") for any regex containing a bracket character class, such as std::regex("[0-9]").

Apparently full support for C++11 <regex> was implemented and released in g++ version 4.9.x, on Jun 26, 2015. Hat tip to SO questions #1 and #2 for figuring out that the compiler version needs to be 4.9.x or later.

Here is the C++11 code that should work but I wasn't able to test it:

#include <iostream>
#include <string>
#include <regex>

using std::cout;
using std::endl;

int main() {
    std::string input = "Example_45-3";
    std::string output = std::regex_replace(
        input,
        std::regex("[^0-9]*([0-9]+).*"),
        std::string("\\1")
        );
    cout << input << endl;
    cout << output << endl;
}

Boost solution that only requires C++98

Minimal implementation example that works on many strings (not just strings of the form "text_45-text"):

#include <iostream>
#include <string>
using namespace std;
#include <boost/regex.hpp>

int main() {
    string input = "Example_45-3";
    string output = boost::regex_replace(
        input,
        boost::regex("[^0-9]*([0-9]+).*"),
        string("\\1")
        );
    cout << input << endl;
    cout << output << endl;
}

console output:

Example_45-3
45

Other example strings that this would work on:

  • "asdfasdf 45 sdfsdf"
  • "X = 45, sdfsdf"

For this example I used g++ on Linux with #include <boost/regex.hpp> and -lboost_regex. You could also use the C++11 <regex> header instead.

Feel free to edit my solution if you have a better regex.
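One thing to be aware of with the regex_replace approach: when the input contains no digits, the pattern does not match and the input comes back unchanged. If that matters, a search-based variant can report "nothing found" explicitly. The following is only a sketch; the helper name first_digits is made up for illustration and is not part of the answer above.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not from the answer above): returns the first run of
// digits in `input`, or an empty string when the input has no digits at all.
std::string first_digits(const std::string& input) {
    static const std::regex digits("[0-9]+");  // compiled once, then reused
    std::smatch match;
    if (std::regex_search(input, match, digits)) {
        return match.str(0);  // the whole match is the digit run
    }
    return std::string();
}
```

Calling first_digits("Example_45-3") should give "45", and an input without digits gives an empty string rather than the original text.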


Commentary:

If there are no performance constraints, a regex is ideal for this sort of thing because you aren't reinventing the wheel by writing a bunch of string-parsing code that takes time to write and test fully.

Additionally, if/when your strings become more complex or have more varied patterns, a regex easily accommodates the complexity. (The question's example pattern is easy enough, but often a more complex pattern would take 10-100+ lines of code when a one-line regex would do the same.)
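As a small illustration of how the same regex scales to a slightly different task, here is a sketch that collects every number in the string instead of only the first one. The function name all_digit_runs is hypothetical, and the code assumes a compiler with working C++11 <regex> support.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: collects every run of digits in `input`, in order.
std::vector<std::string> all_digit_runs(const std::string& input) {
    static const std::regex digits("[0-9]+");
    std::vector<std::string> runs;
    // A default-constructed sregex_iterator acts as the end-of-sequence marker.
    for (std::sregex_iterator it(input.begin(), input.end(), digits), end;
         it != end; ++it) {
        runs.push_back(it->str());
    }
    return runs;
}
```

For "Example_45-3" this should yield the two strings "45" and "3".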

You can also use the built-in std::string member functions find_first_of and find_first_not_of to find the first "numberstring" in any string.

std::string first_numberstring(std::string const & str)
{
  std::size_t const n = str.find_first_of("0123456789");
  if (n != std::string::npos)
  {
    std::size_t const m = str.find_first_not_of("0123456789", n);
    return str.substr(n, m != std::string::npos ? m-n : m);
  }
  return std::string();
}
Lingxi

This should be more efficient than Ashot Khachatryan's solution. Note the use of '_' and '-' instead of "_" and "-". And also, the starting position of the search for '-'.

inline std::string mid_num_str(const std::string& s) {
    std::string::size_type p  = s.find('_');
    std::string::size_type pp = s.find('-', p + 2); 
    return s.substr(p + 1, pp - p - 1);
}

If you need a number instead of a string, like what Alexandr Lapenkov's solution has done, you may also want to try the following:

inline long mid_num(const std::string& s) {
    return std::strtol(&s[s.find('_') + 1], nullptr, 10);
}
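Note that mid_num as written cannot signal failure: std::strtol returns 0 when it finds nothing to convert, and if there is no '_' the search result wraps around so parsing starts at the beginning of the string. A checked variant might look like the sketch below. The name mid_num_checked and the bool/out-parameter interface are assumptions made for illustration, and overflow is deliberately not handled.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical checked variant: writes the number after the first '_' to
// `out` and returns true, or returns false when there is no '_' or when
// nothing after it can be converted.
bool mid_num_checked(const std::string& s, long& out) {
    const std::string::size_type p = s.find('_');
    if (p == std::string::npos) return false;
    const char* start = s.c_str() + p + 1;
    char* end = nullptr;
    const long value = std::strtol(start, &end, 10);
    if (end == start) return false;  // strtol consumed no characters
    out = value;
    return true;
}
```

Like std::strtol itself, this accepts leading whitespace and a sign after the underscore.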

Check this out

std::string ex = "Example_45-3";
int num;
sscanf( ex.c_str(), "%*[^_]_%d", &num );
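The snippet above leaves num uninitialised when the input does not match the format. sscanf returns the number of items it assigned, so checking that value is a cheap safeguard. A minimal sketch, with a made-up wrapper name scan_mid_num:

```cpp
#include <cstdio>
#include <string>

// Hypothetical wrapper: returns true only if sscanf actually assigned the
// integer that follows the first '_'.
bool scan_mid_num(const std::string& s, int& out) {
    int value = 0;
    // %*[^_] reads (and discards) one or more characters that are not '_'.
    if (std::sscanf(s.c_str(), "%*[^_]_%d", &value) != 1) return false;
    out = value;
    return true;
}
```

Because %*[^_] needs at least one character to match, this format also rejects strings that begin with '_'.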

I can think of two ways of doing it:

  • Use regular expressions
  • Use an iterator to step through the string, and copy each consecutive digit to a temporary buffer. Break when it reaches an unreasonable length or on the first non-digit after a string of consecutive digits. Then you have a string of digits that you can easily convert.
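The iterator approach from the second bullet could be sketched as follows, using standard algorithms rather than a hand-written loop. The name first_digit_run is hypothetical, and the "unreasonable length" cut-off mentioned above is left out for brevity.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper illustrating the iterator approach: find the first
// digit, then the first non-digit after it, and copy the range in between.
std::string first_digit_run(const std::string& s) {
    // Take unsigned char so std::isdigit never sees a negative value.
    auto is_digit = [](unsigned char c) { return std::isdigit(c) != 0; };
    const auto first = std::find_if(s.begin(), s.end(), is_digit);
    const auto last = std::find_if_not(first, s.end(), is_digit);
    return std::string(first, last);  // empty when there are no digits
}
```

The resulting string can then be converted with std::stoi or std::strtol if a numeric value is needed.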

std::string s = "Example_45-3";
std::size_t p1 = s.find("_");  // position of the underscore
std::size_t p2 = s.find("-");  // position of the dash
std::string number = s.substr(p1 + 1, p2 - p1 - 1);  // the text between them

The 'best' way to do this in C++11 and later is probably using regular expressions, which combine high expressiveness and high performance when the test is repeated often enough.

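The following code demonstrates the basics. It needs #include <regex>, along with <string>, <vector>, <iostream> and <cstdlib> (for abort), and is meant to sit inside a function such as main.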

// The example inputs
std::vector<std::string> inputs {
    "Example_0-0", "Example_0-1", "Example_0-2", "Example_0-3", "Example_0-4",
    "Example_1-0", "Example_1-1", "Example_1-2", "Example_1-3", "Example_1-4"
};

// The regular expression. A lot of the cost is incurred when building the
// std::regex object, but when it's reused a lot that cost is amortised.
std::regex imgNumRegex { "^[^_]+_([[:digit:]]+)-([[:digit:]]+)$" };

for (const auto &input: inputs){
    // This will contain the match results. Parts of the regular expression
    // enclosed in parentheses will be stored here, so in this case: both numbers
    std::smatch matchResults;

    if (!std::regex_match(input, matchResults, imgNumRegex)) {
        // Handle failure to match
        abort();
    }

    // Note that the first match is in str(1). str(0) contains the whole string
    std::string theFirstNumber = matchResults.str(1);
    std::string theSecondNumber = matchResults.str(2);

    std::cout << "The input had numbers " << theFirstNumber;
    std::cout << " and " << theSecondNumber << std::endl;
}
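If the captured groups are needed as integers rather than strings, they can be converted with std::stoi. The sketch below wraps the same regular expression in a function; the name parse_pair and the bool/out-parameter interface are assumptions for illustration. std::stoi can still throw std::out_of_range for very long digit sequences, which this sketch does not catch.

```cpp
#include <regex>
#include <string>
#include <utility>

// Hypothetical helper: parses "name_A-B" into the pair (A, B).
// Returns false when the input does not have that shape.
bool parse_pair(const std::string& input, std::pair<int, int>& out) {
    static const std::regex re("^[^_]+_([[:digit:]]+)-([[:digit:]]+)$");
    std::smatch m;
    if (!std::regex_match(input, m, re)) return false;
    // Both groups contain only digits, so std::stoi cannot fail to parse them.
    out = std::make_pair(std::stoi(m.str(1)), std::stoi(m.str(2)));
    return true;
}
```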

Using @Pixelchemist's answer and e.g. std::stoul:

bool getFirstNumber(std::string const & a_str, unsigned long & a_outVal)
{
    auto pos = a_str.find_first_of("0123456789");

    try
    {
        if (std::string::npos != pos)
        {
            a_outVal = std::stoul(a_str.substr(pos));

            return true;
        }
    }
    catch (...)
    {
        // handle conversion failure
        // ...
    }

    return false;
}