I spent hours to dig the behavior, first about those questions:
I digged deeper and deeper. Now I think it's clear about these facts:
With O_APPEND, parallel write(2) from multiple processes is ok. It's just order of lines are undetermined, but lines don't interleave or overwrite each other. And the size of data is by any amount, according to Niall Douglas's answer for Understanding concurrent file writes from multiple processes. I have tested about this "by any amount" on linux and have not found the upper limit, so i guess it's right.
Without O_APPEND, it will be a mess. Here is what POSIX says "This volume of POSIX.1-2008 does not specify behavior of concurrent writes to a file from multiple processes. Applications should use some form of concurrency control."
Now we come into python. The test i did in EDIT3, that 8K, i found it's origin. Python's write() use fwrite(3) in fact, and my python set a BUFF_SIZE here which is 8192. According to an answer from abarnert in Default buffer size for a file on Linux. This 8192 has a long story.
However, more information is welcome.