I have to format std::string with sprintf and send it into file stream. How can I do this?
Tested, Production Quality Answer
This answer handles the general case with standards compliant techniques. The same approach is given as an example on CppReference.com near the bottom of their page. Unlike their example, this code fits the question's requirements and is field tested in robotics and satellite applications. It also has improved commenting. Design quality is discussed further below.
#include <string>
#include <cstdarg>
#include <vector>
// requires at least C++11
const std::string vformat(const char * const zcFormat, ...) {
// initialize use of the variable argument array
va_list vaArgs;
va_start(vaArgs, zcFormat);
// reliably acquire the size
// from a copy of the variable argument array
// and a functionally reliable call to mock the formatting
va_list vaArgsCopy;
va_copy(vaArgsCopy, vaArgs);
const int iLen = std::vsnprintf(NULL, 0, zcFormat, vaArgsCopy);
va_end(vaArgsCopy);
// return a formatted string without risking memory mismanagement
// and without assuming any compiler or platform specific behavior
std::vector<char> zc(iLen + 1);
std::vsnprintf(zc.data(), zc.size(), zcFormat, vaArgs);
va_end(vaArgs);
return std::string(zc.data(), iLen); }
#include <ctime>
#include <iostream>
#include <iomanip>
// demonstration of use
int main() {
std::time_t t = std::time(nullptr);
std::cerr
<< std::put_time(std::localtime(& t), "%D %T")
<< " [debug]: "
<< vformat("Int 1 is %d, Int 2 is %d, Int 3 is %d", 11, 22, 33)
<< std::endl;
return 0; }
Predictable Linear Efficiency
Two passes are necessities for a secure, reliable, and predictable reusable function per the question specifications. Presumptions about the distribution of sizes of vargs in a reusable function is bad programming style and should be avoided. In this case, arbitrarily large variable length representations of vargs is a key factor in choice of algorithm.
Retrying upon overflow is exponentially inefficient, which is another reason discussed when the C++11 standards committee discussed the above proposal to provide a dry run when the write buffer is null.
In the above production ready implementation, the first run is such a dry run to determine allocation size. No allocation occurs. Parsing of printf directives and the reading of vargs has been made extremely efficient over decades. Reusable code should be predictable, even if a small inefficiency for trivial cases must be sacrificed.
Security and Reliability
Andrew Koenig said to a small group of us after his lecture at a Cambridge event, "User functions shouldn't rely on the exploitation of a failure for unexceptional functionality." As usual, his wisdom has been shown true in the record since. Fixed and closed security bug issues often indicate retry hacks in the description of the hole exploited prior to the fix.
This is mentioned in the formal standards revision proposal for the null buffer feature in Alternative to sprintf, C9X Revision Proposal, ISO IEC Document WG14 N645/X3J11 96-008. An arbitrarily long string inserted per print directive, "%s," within the constraints of dynamic memory availability, is not an exception, and should not be exploited to produce, "Unexceptional functionality."
Consider the proposal along side the example code given at the bottom of the C++Reference.org page linked to in the first paragraph of this answer.
Also, the testing of failure cases is rarely as robust of success cases.
Portability
All major O.S. vendors provide compilers that fully support std::vsnprintf as part of the c++11 standards. Hosts running products of vendors that no longer maintain distributions should be furnished with g++ or clang++ for many reasons.
Stack Use
Stack use in the 1st call to std::vsnprintf will be less than or equal to that of the 2nd, and and it will be freed before the 2nd call begins. If the first call exceeds stack availability, then std::fprintf would fail too.
UPDATE 1: added fmt::format
tests
I've took my own investigation around methods has introduced here and gain diametrically opposite results versus mentioned here.
I have used 4 functions over 4 methods:
vsnprintf
+ std::unique_ptr
vsnprintf
+ std::string
std::ostringstream
+ std::tuple
+ utility::for_each
fmt::format
function from fmt
libraryFor the test backend the googletest
has used.
#include <string>
#include <cstdarg>
#include <cstdlib>
#include <memory>
#include <algorithm>
#include <fmt/format.h>
inline std::string string_format(size_t string_reserve, const std::string fmt_str, ...)
{
size_t str_len = (std::max)(fmt_str.size(), string_reserve);
// plain buffer is a bit faster here than std::string::reserve
std::unique_ptr<char[]> formatted;
va_list ap;
va_start(ap, fmt_str);
while (true) {
formatted.reset(new char[str_len]);
const int final_n = vsnprintf(&formatted[0], str_len, fmt_str.c_str(), ap);
if (final_n < 0 || final_n >= int(str_len))
str_len += (std::abs)(final_n - int(str_len) + 1);
else
break;
}
va_end(ap);
return std::string(formatted.get());
}
inline std::string string_format2(size_t string_reserve, const std::string fmt_str, ...)
{
size_t str_len = (std::max)(fmt_str.size(), string_reserve);
std::string str;
va_list ap;
va_start(ap, fmt_str);
while (true) {
str.resize(str_len);
const int final_n = vsnprintf(const_cast<char *>(str.data()), str_len, fmt_str.c_str(), ap);
if (final_n < 0 || final_n >= int(str_len))
str_len += (std::abs)(final_n - int(str_len) + 1);
else {
str.resize(final_n); // do not forget to shrink the size!
break;
}
}
va_end(ap);
return str;
}
template <typename... Args>
inline std::string string_format3(size_t string_reserve, Args... args)
{
std::ostringstream ss;
if (string_reserve) {
ss.rdbuf()->str().reserve(string_reserve);
}
std::tuple<Args...> t{ args... };
utility::for_each(t, [&ss](auto & v)
{
ss << v;
});
return ss.str();
}
The for_each
implementation is taken from here: iterate over tuple
#include <type_traits>
#include <tuple>
namespace utility {
template <std::size_t I = 0, typename FuncT, typename... Tp>
inline typename std::enable_if<I == sizeof...(Tp), void>::type
for_each(std::tuple<Tp...> &, const FuncT &)
{
}
template<std::size_t I = 0, typename FuncT, typename... Tp>
inline typename std::enable_if<I < sizeof...(Tp), void>::type
for_each(std::tuple<Tp...> & t, const FuncT & f)
{
f(std::get<I>(t));
for_each<I + 1, FuncT, Tp...>(t, f);
}
}
The tests:
TEST(ExternalFuncs, test_string_format_on_unique_ptr_0)
{
for (size_t i = 0; i < 1000000; i++) {
const std::string v = string_format(0, "%s+%u\n", "test test test", 12345);
UTILITY_SUPPRESS_OPTIMIZATION_ON_VAR(v);
}
}
TEST(ExternalFuncs, test_string_format_on_unique_ptr_256)
{
for (size_t i = 0; i < 1000000; i++) {
const std::string v = string_format(256, "%s+%u\n", "test test test", 12345);
UTILITY_SUPPRESS_OPTIMIZATION_ON_VAR(v);
}
}
TEST(ExternalFuncs, test_string_format_on_std_string_0)
{
for (size_t i = 0; i < 1000000; i++) {
const std::string v = string_format2(0, "%s+%u\n", "test test test", 12345);
UTILITY_SUPPRESS_OPTIMIZATION_ON_VAR(v);
}
}
TEST(ExternalFuncs, test_string_format_on_std_string_256)
{
for (size_t i = 0; i < 1000000; i++) {
const std::string v = string_format2(256, "%s+%u\n", "test test test", 12345);
UTILITY_SUPPRESS_OPTIMIZATION_ON_VAR(v);
}
}
TEST(ExternalFuncs, test_string_format_on_string_stream_on_variadic_tuple_0)
{
for (size_t i = 0; i < 1000000; i++) {
const std::string v = string_format3(0, "test test test", "+", 12345, "\n");
UTILITY_SUPPRESS_OPTIMIZATION_ON_VAR(v);
}
}
TEST(ExternalFuncs, test_string_format_on_string_stream_on_variadic_tuple_256)
{
for (size_t i = 0; i < 1000000; i++) {
const std::string v = string_format3(256, "test test test", "+", 12345, "\n");
UTILITY_SUPPRESS_OPTIMIZATION_ON_VAR(v);
}
}
TEST(ExternalFuncs, test_string_format_on_string_stream_inline_0)
{
for (size_t i = 0; i < 1000000; i++) {
std::ostringstream ss;
ss << "test test test" << "+" << 12345 << "\n";
const std::string v = ss.str();
UTILITY_SUPPRESS_OPTIMIZATION_ON_VAR(v);
}
}
TEST(ExternalFuncs, test_string_format_on_string_stream_inline_256)
{
for (size_t i = 0; i < 1000000; i++) {
std::ostringstream ss;
ss.rdbuf()->str().reserve(256);
ss << "test test test" << "+" << 12345 << "\n";
const std::string v = ss.str();
UTILITY_SUPPRESS_OPTIMIZATION_ON_VAR(v);
}
}
TEST(ExternalFuncs, test_fmt_format_positional)
{
for (size_t i = 0; i < 1000000; i++) {
const std::string v = fmt::format("{0:s}+{1:d}\n", "test test test", 12345);
UTILITY_SUPPRESS_OPTIMIZATION_ON_VAR(v);
}
}
TEST(ExternalFuncs, test_fmt_format_named)
{
for (size_t i = 0; i < 1000000; i++) {
const std::string v = fmt::format("{first:s}+{second:d}\n", fmt::arg("first", "test test test"), fmt::arg("second", 12345));
UTILITY_SUPPRESS_OPTIMIZATION_ON_VAR(v);
}
}
The UTILITY_SUPPRESS_OPTIMIZATION_ON_VAR
.
unsued.hpp:
#define UTILITY_SUPPRESS_OPTIMIZATION_ON_VAR(var) ::utility::unused_param(&var)
namespace utility {
extern const volatile void * volatile g_unused_param_storage_ptr;
extern void
#ifdef __GNUC__
__attribute__((optimize("O0")))
#endif
unused_param(const volatile void * p);
}
unused.cpp:
namespace utility {
const volatile void * volatile g_unused_param_storage_ptr = nullptr;
void
#ifdef __GNUC__
__attribute__((optimize("O0")))
#endif
unused_param(const volatile void * p)
{
g_unused_param_storage_ptr = p;
}
}
RESULTS:
[ RUN ] ExternalFuncs.test_string_format_on_unique_ptr_0
[ OK ] ExternalFuncs.test_string_format_on_unique_ptr_0 (556 ms)
[ RUN ] ExternalFuncs.test_string_format_on_unique_ptr_256
[ OK ] ExternalFuncs.test_string_format_on_unique_ptr_256 (331 ms)
[ RUN ] ExternalFuncs.test_string_format_on_std_string_0
[ OK ] ExternalFuncs.test_string_format_on_std_string_0 (457 ms)
[ RUN ] ExternalFuncs.test_string_format_on_std_string_256
[ OK ] ExternalFuncs.test_string_format_on_std_string_256 (279 ms)
[ RUN ] ExternalFuncs.test_string_format_on_string_stream_on_variadic_tuple_0
[ OK ] ExternalFuncs.test_string_format_on_string_stream_on_variadic_tuple_0 (1214 ms)
[ RUN ] ExternalFuncs.test_string_format_on_string_stream_on_variadic_tuple_256
[ OK ] ExternalFuncs.test_string_format_on_string_stream_on_variadic_tuple_256 (1325 ms)
[ RUN ] ExternalFuncs.test_string_format_on_string_stream_inline_0
[ OK ] ExternalFuncs.test_string_format_on_string_stream_inline_0 (1208 ms)
[ RUN ] ExternalFuncs.test_string_format_on_string_stream_inline_256
[ OK ] ExternalFuncs.test_string_format_on_string_stream_inline_256 (1302 ms)
[ RUN ] ExternalFuncs.test_fmt_format_positional
[ OK ] ExternalFuncs.test_fmt_format_positional (288 ms)
[ RUN ] ExternalFuncs.test_fmt_format_named
[ OK ] ExternalFuncs.test_fmt_format_named (392 ms)
As you can see implementation through the vsnprintf
+std::string
is equal to fmt::format
, but faster than through the vsnprintf
+std::unique_ptr
, which is faster than through the std::ostringstream
.
The tests compiled in Visual Studio 2015 Update 3
and run at Windows 7 x64 / Intel Core i7-4820K CPU @ 3.70GHz / 16GB
.
Poco Foundation library has a very convenient format function, which supports std::string in both the format string and the values:
I wrote my own using vsnprintf so it returns string instead of having to create my own buffer.
#include <string>
#include <cstdarg>
//missing string printf
//this is safe and convenient but not exactly efficient
inline std::string format(const char* fmt, ...){
int size = 512;
char* buffer = 0;
buffer = new char[size];
va_list vl;
va_start(vl, fmt);
int nsize = vsnprintf(buffer, size, fmt, vl);
if(size<=nsize){ //fail delete buffer and try again
delete[] buffer;
buffer = 0;
buffer = new char[nsize+1]; //+1 for /0
nsize = vsnprintf(buffer, size, fmt, vl);
}
std::string ret(buffer);
va_end(vl);
delete[] buffer;
return ret;
}
So you can use it like
std::string mystr = format("%s %d %10.5f", "omg", 1, 10.5);
template<typename... Args>
std::string string_format(const char* fmt, Args... args)
{
size_t size = snprintf(nullptr, 0, fmt, args...);
std::string buf;
buf.reserve(size + 1);
buf.resize(size);
snprintf(&buf[0], size + 1, fmt, args...);
return buf;
}
Using C99 snprintf and C++11
Took the idea from Dacav and pixelpoint's answer. I played around a bit and got this:
#include <cstdarg>
#include <cstdio>
#include <string>
std::string format(const char* fmt, ...)
{
va_list vl;
va_start(vl, fmt);
int size = vsnprintf(0, 0, fmt, vl) + sizeof('\0');
va_end(vl);
char buffer[size];
va_start(vl, fmt);
size = vsnprintf(buffer, size, fmt, vl);
va_end(vl);
return std::string(buffer, size);
}
With sane programming practice I believe the code should be enough, however I'm still open to more secure alternatives that are still simple enough and would not require C++11.
And here's another version that makes use of an initial buffer to prevent second call to vsnprintf()
when initial buffer is already enough.
std::string format(const char* fmt, ...)
{
va_list vl;
int size;
enum { INITIAL_BUFFER_SIZE = 512 };
{
char buffer[INITIAL_BUFFER_SIZE];
va_start(vl, fmt);
size = vsnprintf(buffer, INITIAL_BUFFER_SIZE, fmt, vl);
va_end(vl);
if (size < INITIAL_BUFFER_SIZE)
return std::string(buffer, size);
}
size += sizeof('\0');
char buffer[size];
va_start(vl, fmt);
size = vsnprintf(buffer, size, fmt, vl);
va_end(vl);
return std::string(buffer, size);
}
(It turns out that this version is just similar to Piti Ongmongkolkul's answer, only that it doesn't use new
and delete[]
, and also specifies a size when creating std::string
.
The idea here of not using new
and delete[]
is to imply usage of the stack over the heap since it doesn't need to call allocation and deallocation functions, however if not properly used, it could be dangerous to buffer overflows in some (perhaps old, or perhaps just vulnerable) systems. If this is a concern, I highly suggest using new
and delete[]
instead. Note that the only concern here is about the allocations as vsnprintf()
is already called with limits, so specifying a limit based on the size allocated on the second buffer would also prevent those.)