I need an MPI C code to write data to a binary file via MPI I/O. I need process 0 to write a short header, then I need the whole range of processes to write their own pieces of
Your approach is fine, and if you just need something that puts bits in a file right now, go ahead and call it done.
Here are some suggestions for more efficiency:
You can consult the status object (with MPI_GET_COUNT) to find out how many bytes were written, instead of getting the file position and translating it into bytes.
If you have the memory to hold all the data before you write, you could describe your I/O with an MPI datatype (admittedly, one that might end up being a pain to create). Then all processes would issue a single collective call.
You should use collective I/O instead of independent I/O. A "quality library" should be able to give you equal if not better performance (and if not, you could raise the issue with your MPI implementation).
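If the processes have different amounts of data to write, MPI_EXSCAN is a good way to work out where each one should write: an exclusive prefix sum over the per-process byte counts gives every process the total written by the lower ranks, which is its starting offset once you add the header size. Note that the result on rank 0 is undefined, so treat it as 0. Then each process can call MPI_FILE_WRITE_AT_ALL with that offset.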