C++17 will include std::byte, a type for one atomically-addressable unit of memory, having 8 bits on typical computers.
Before this standardization, there is already a b
(This is a potential rule of thumb which comes off the top of my head, not condoned by anyone.)
char * for sequences of textual characters, not anything else.
void * in type-erasure scenarios, i.e. when the pointed-to data is typed, but for some reason a typed pointer must not be used, or it cannot be determined whether the data is typed or not.
byte * for raw memory for which there is no indication of it holding any typed data.
An exception to the above:
void * / unsigned char * / char * when older or non-C++ code forces you to, where you would otherwise use byte *. But wrap that with a byte *-based interface as tightly as you can, rather than exposing it to the rest of your C++ code.
Examples:
void * my_custom_malloc(size_t size) - wrong
byte * my_custom_malloc(size_t size) - right
struct buffer_t { byte* data; size_t length; my_type_t data_type; } - wrong
struct buffer_t { void* data; size_t length; my_type_t data_type; } - right
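To make the two "right" cases concrete, here is a minimal sketch. The names my_custom_malloc, buffer_t and my_type_t come from the examples above; my_custom_free, element_as_double and the two enumerators are made up for illustration, and the allocator simply forwards to std::malloc as an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// The allocator hands out raw memory that holds no typed data yet,
// so it exposes std::byte* and keeps malloc's void* out of sight.
std::byte* my_custom_malloc(std::size_t size)
{
    return static_cast<std::byte*>(std::malloc(size));
}

// Hypothetical counterpart to my_custom_malloc.
void my_custom_free(std::byte* p)
{
    std::free(p);
}

// Hypothetical type tag standing in for my_type_t.
enum class my_type_t { int32, float64 };

// The data here IS typed (data_type says how), so void* fits better.
struct buffer_t {
    void*       data;
    std::size_t length;     // number of elements, not bytes
    my_type_t   data_type;
};

// Hypothetical accessor: recovers the real element type from the tag.
double element_as_double(const buffer_t& buf, std::size_t i)
{
    assert(i < buf.length);
    switch (buf.data_type) {
    case my_type_t::int32:
        return static_cast<const std::int32_t*>(buf.data)[i];
    case my_type_t::float64:
        return static_cast<const double*>(buf.data)[i];
    }
    return 0.0; // not reached for valid tags
}
```

The point of the split is that code holding a byte * may freely index into the memory, while code holding the void * must consult data_type before touching anything.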
First, void * still makes sense when you have to use a C library function or, more generally, any other extern "C" compatible function.
Next, a std::byte array still allows individual access to any of its elements. Said differently, this is legal:
std::byte *arr = ...;
arr[i] = std::byte{0x2a};
It makes sense if you want to be able to allow that low level access, for example if you want to manually copy all or parts of the array.
On the other hand, void * is really an opaque pointer, in the sense that you will have to cast it (to a char * or byte *) before being able to access its individual elements.
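A small sketch of that cast, assuming a hand-written copy routine. The name copy_bytes is made up for illustration; in real code std::memcpy would normally do this job.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: copies n bytes by hand. The void* parameters can be
// neither indexed nor incremented, so they are cast to std::byte* first.
void copy_bytes(void* dst, const void* src, std::size_t n)
{
    assert(n == 0 || (dst != nullptr && src != nullptr));
    auto*       d = static_cast<std::byte*>(dst);
    const auto* s = static_cast<const std::byte*>(src);
    for (std::size_t i = 0; i < n; ++i) {
        d[i] = s[i]; // element access only works on the byte pointers
    }
}
```

Writing dst[i] directly on the void * would not compile, which is exactly the "opaque" property described above.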
So my opinion is that std::byte should be used as soon as you want to be able to address elements of an array or move a pointer, while void * still makes sense to denote an opaque zone that will only be passed around as a whole (it is hard to actually process a void *).
But real use cases for void * should become more and more unusual in modern C++, at least at a high level, because those opaque zones should normally be hidden in higher-level classes that come with methods to process them. So IMHO void * should in the end be limited to compatibility with C (and older C++ versions), and to low-level code (such as allocation code).
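As a rough illustration of confining void * to the compatibility boundary, the sketch below wraps std::qsort, a real C library function whose callback takes const void *. It uses a typed function rather than a full class to stay short. The names compare_ints and sort_ints are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Callback in the shape std::qsort demands: the C interface only knows
// const void*, so the cast back to int happens here and nowhere else.
int compare_ints(const void* lhs, const void* rhs)
{
    const int a = *static_cast<const int*>(lhs);
    const int b = *static_cast<const int*>(rhs);
    return (a > b) - (a < b);
}

// Hypothetical typed wrapper: callers of sort_ints never see a void*.
void sort_ints(int* values, std::size_t count)
{
    assert(values != nullptr || count == 0);
    if (count == 0) {
        return;
    }
    std::qsort(values, count, sizeof(int), compare_ints);
}
```

In new C++ code std::sort would be the natural choice; the wrapper only shows where the void * ends up when a C interface has to be used.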
std::byte is not just about "raw memory"; it is byte-addressable raw memory with bitwise operations defined for it.
You should not use std::byte to just blindly replace void*. void* retains its use. To the handling code, void* means "this is a block of data, but I don't know what this data is, nor do I know how to operate on it."
Use std::byte when you need byte addressing of the memory block and only need bitwise operations for operating on that data.
std::byte does not have regular basic math operations defined, such as operator+, operator- or operator*. That's right, the following code is illegal:
std::byte a{0b11},b{0b11000};
std::byte c = a+b; // fails, operator+ not defined for std::byte
In other words, use void* when the contents are none of the handling code's business.
Like I said above, all you can do on a std::byte are the bitwise operations such as |, &, ^, ~ and the shifts. An example of the use of std::byte follows. Note that you need a compiler in C++17 mode to compile it.
#include <iostream>
#include <cstddef>
#include <bitset>
using namespace std;

void print(const byte& b)
{
    bitset<8> p( to_integer<int>( b ) );
    cout << p << endl;
}

int main()
{
    byte a{0b11}, b{0b11000};
    byte c = a | b;
    //byte d = a+b; // fails
    print(c);
    return 0;
}
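If arithmetic on a byte value is genuinely needed, one option is to convert explicitly with std::to_integer, compute on the integer, and convert back. The sketch below assumes wrap-around modulo 256 is the wanted behaviour. The helper names add_bytes and high_nibble are made up for illustration.

```cpp
#include <cstddef>

// Hypothetical helper: adds two bytes by going through unsigned int
// explicitly, wrapping around modulo 256.
std::byte add_bytes(std::byte a, std::byte b)
{
    const unsigned sum = std::to_integer<unsigned>(a) + std::to_integer<unsigned>(b);
    return static_cast<std::byte>(sum & 0xFFu);
}

// Shifts and compound assignment are also defined for std::byte.
std::byte high_nibble(std::byte b)
{
    b >>= 4;
    return b;
}
```

The explicit conversions are the design point: the arithmetic is still possible, but it can no longer happen by accident.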
What is the motivation for std::byte?
Quoting from the original paper:
Many programs require byte-oriented access to memory. Today, such programs must use either the char, signed char, or unsigned char types for this purpose. However, these types perform a “triple duty”. Not only are they used for byte addressing, but also as arithmetic types, and as character types. This multiplicity of roles opens the door for programmer error - such as accidentally performing arithmetic on memory that should be treated as a byte value - and confusion for both programmers and tools.
In essence, std::byte is there to "replace" the use of char-types when required to deal with raw memory as bytes. It would be safe to assert that this applies whether it is used by value, by reference, through pointers, or in containers.
std::byte does not have the same connotations as a char; it's about raw memory, not characters
Correct, so std::byte should be preferred over char-types when dealing with bytes in memory (as in, an array of bytes). Some lower-level protocol data manipulation immediately comes to mind.
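As a sketch of the kind of protocol handling meant here, the helpers below decode a 16-bit big-endian field and test a flag bit in a std::byte buffer. The names read_u16_be and has_flag, and the packet layout used with them, are made up for illustration and do not refer to any particular protocol.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: decodes a 16-bit big-endian field from a raw buffer.
std::uint16_t read_u16_be(const std::byte* data, std::size_t size, std::size_t offset)
{
    assert(offset + 2 <= size);
    const auto hi = std::to_integer<std::uint16_t>(data[offset]);
    const auto lo = std::to_integer<std::uint16_t>(data[offset + 1]);
    return static_cast<std::uint16_t>((hi << 8) | lo);
}

// Hypothetical helper: tests flag bits using only bitwise operations.
bool has_flag(std::byte flags, std::byte mask)
{
    return (flags & mask) != std::byte{0};
}
```

The buffer stays a sequence of bytes throughout; a numeric value only appears at the point where std::to_integer is called on purpose.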
What is a good rule of thumb, for the days of std::byte, regarding when to prefer it over void * and when it's the other way around?
I would argue that similar guidelines apply now as they did previously. When dealing with raw blocks of memory where byte addressability is required, char * etc. would have been preferred over void *. I think the same logic applies now, but prefer byte * over char *. A char * is better for character sequences.
If the desire is to pass around a pointer opaquely, void * probably still best fits the problem. void * essentially means "point to anything", but the anything is still something; we are just not saying what yet.
Further, the types uintptr_t (and intptr_t) would probably factor in as alternatives, depending of course on the desired application.
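One application where uintptr_t fits is inspecting the numeric value of an address, for example to check alignment. The sketch below assumes the common flat address model, where the converted integer is the address itself, and a platform that provides std::uintptr_t (the type is formally optional). The name is_aligned is made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: checks alignment by looking at the pointer as an
// integer. The alignment must be a power of two.
bool is_aligned(const void* p, std::size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (reinterpret_cast<std::uintptr_t>(p) & (alignment - 1)) == 0;
}
```

Unlike void * or byte *, the integer cannot be dereferenced at all, which makes it suitable for handles and address arithmetic but not for accessing memory.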
... I mostly mean new code where you get to choose all the types.
New code generally has very limited use of void * outside of compatibility (where you don't get to choose the type). If you need byte-based processing, then favour byte *.