par2 has a small and fairly clean C++ codebase, which I think builds fine on GNU/Linux, OS X, and Windows (with MSVC++).
I'd like to incorporate an x86-64 asm versi
While I have no good solution for removing the dependency on a particular assembler, I do have a suggestion for dealing with the two different 64-bit calling conventions: Microsoft x64 versus the System V ABI.
The lowest common denominator is the Microsoft x64 calling convention, since it passes only the first four arguments in registers (System V passes the first six integer or pointer arguments that way). So if you limit yourself to four arguments and use macros to name the registers, you can easily make your code assemble for both Unix (Linux/BSD/OS X) and Windows.
For example, look at the file strcat64.asm in Agner Fog's asmlib:
    %IFDEF WINDOWS
    %define Rpar1 rcx               ; function parameter 1
    %define Rpar2 rdx               ; function parameter 2
    %define Rpar3 r8                ; function parameter 3
    %ENDIF
    %IFDEF UNIX
    %define Rpar1 rdi               ; function parameter 1
    %define Rpar2 rsi               ; function parameter 2
    %define Rpar3 rdx               ; function parameter 3
    %ENDIF

            push    Rpar1           ; dest
            push    Rpar2           ; src
            call    A_strlen        ; length of dest
            push    rax             ; strlen(dest)
            mov     Rpar1, [rsp+8]  ; src
            call    A_strlen        ; length of src
            pop     Rpar1           ; strlen(dest)
            pop     Rpar2           ; src
            add     Rpar1, [rsp]    ; dest + strlen(dest)
            lea     Rpar3, [rax+1]  ; strlen(src)+1
            call    A_memcpy        ; copy
            pop     rax             ; return dest
            ret
    ;A_strcat ENDP
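For readers who don't read assembly fluently, the logic of the listing can be sketched in C++. This is only an illustration of what the instructions compute: `strcat_reference` is a hypothetical name, and `std::strlen` and `std::memcpy` stand in for asmlib's `A_strlen` and `A_memcpy`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical C++ rendering of the assembly listing; not part of asmlib.
// std::strlen and std::memcpy stand in for A_strlen and A_memcpy.
char* strcat_reference(char* dest, const char* src) {
    assert(dest != nullptr && src != nullptr);
    std::size_t dest_len = std::strlen(dest);  // first call to A_strlen
    std::size_t src_len  = std::strlen(src);   // second call to A_strlen
    // Copy src plus its terminating NUL (strlen(src)+1 bytes) to the
    // end of dest, matching the "add" / "lea" / "call A_memcpy" sequence.
    std::memcpy(dest + dest_len, src, src_len + 1);
    return dest;                               // "pop rax ; return dest"
}
```

As with the standard `strcat`, the caller must make sure `dest` has room for both strings and the terminator.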
I don't think four registers is really a limitation. If you're writing something in assembly, it's because you want the best efficiency, in which case the function-call overhead should be negligible compared to the work done inside the function. Pushing or popping a few values to or from the stack when calling the function should therefore not make a noticeable difference in performance.
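If a routine really does need more than four inputs, one way to stay within four register arguments is to pack the extras into a struct and pass a pointer to it. The sketch below shows only the C++ side. All names (`KernelArgs`, `scale_add_kernel`) are hypothetical and not taken from par2, the arithmetic is an arbitrary placeholder, and a plain C++ body stands in for the assembly routine so the example runs on its own.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical parameter block: extra inputs travel through memory,
// so the function itself needs only four pointer/integer arguments.
struct KernelArgs {
    std::uint8_t factor;  // placeholder multiplier
    std::uint8_t bias;    // placeholder additive term
};

// extern "C" disables C++ name mangling, so an assembler file could
// export the same symbol. Four arguments fit in registers under both
// conventions:
//   Microsoft x64: rcx, rdx, r8, r9     System V: rdi, rsi, rdx, rcx
// Here a portable C++ body stands in for the assembly implementation.
extern "C" void scale_add_kernel(std::uint8_t* dst, const std::uint8_t* src,
                                 std::size_t len, const KernelArgs* args) {
    assert(dst != nullptr && src != nullptr && args != nullptr);
    for (std::size_t i = 0; i < len; ++i) {
        // Placeholder arithmetic (wraps modulo 256), not par2's actual math.
        dst[i] = static_cast<std::uint8_t>(src[i] * args->factor + args->bias);
    }
}
```

The struct costs one extra memory load per field inside the routine, which is the kind of fixed overhead that should disappear next to a loop over a large buffer.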