What is the preferred way to implement the Add trait efficiently for a Vector type?

北海茫月 2021-01-23 06:18

The Add trait is defined as seen in the documentation.

When implementing it for a Vector, it was required to copy it into the add method to allow syntax lik

1 answer
  •  离开以前
    2021-01-23 06:22

    Since your Vector contains only three values implementing the Float trait (which means they are either f64 or f32), you shouldn't worry about them being copied unless you have profiled your program and determined that the copies cause a performance drop.
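
As a minimal sketch of the by-value approach: the `Vec3<T>` layout below is an assumption (only the name appears in the assembly symbol further down), and the bound is simplified to `Add<Output = T>` from the standard library instead of the `Float` trait so that the example needs no external crate. `sum_twice` is a hypothetical helper that shows the operands remain usable after `+` because the type is `Copy`.

```rust
use std::ops::Add;

// Assumed layout: three components of the same numeric type.
// Deriving `Copy` is what makes passing operands by value cheap.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Vec3<T> {
    pub x: T,
    pub y: T,
    pub z: T,
}

// By-value `Add`: both operands are copied into `add`.
impl<T: Add<Output = T>> Add for Vec3<T> {
    type Output = Vec3<T>;

    fn add(self, rhs: Vec3<T>) -> Vec3<T> {
        Vec3 {
            x: self.x + rhs.x,
            y: self.y + rhs.y,
            z: self.z + rhs.z,
        }
    }
}

// Hypothetical helper: `a` is used again after `a + b`,
// which compiles only because `Vec3<f64>` is `Copy`.
pub fn sum_twice(a: Vec3<f64>, b: Vec3<f64>) -> Vec3<f64> {
    let first = a + b;
    first + a
}
```

With this single impl, `a + b` works without any `&` at the call site, and neither operand is consumed.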

    If your type were not copyable and required allocation on construction (big integers and big floats, for example), you could implement all possible combinations of by-value and by-reference invocations:

    impl Add for YourType { ... }
    impl<'r> Add<YourType> for &'r YourType { ... }
    impl<'a> Add<&'a YourType> for YourType { ... }
    impl<'r, 'a> Add<&'a YourType> for &'r YourType { ... }
    

    and reuse the allocated storage in the implementations that accept at least one argument by value. In that case, however, you will need to use the & operator if you don't want to move your values into the call. Rust prefers explicit over implicit; if you need reference semantics, you have to write them explicitly.
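
The four combinations could be sketched as follows. `BigVec` is a hypothetical heap-backed type standing in for something like a big integer, and element-wise addition over a `Vec<f64>` is an assumption chosen only to keep the example short. Every impl that owns at least one operand writes the result into that operand's buffer; only the reference-plus-reference case allocates.

```rust
use std::ops::Add;

// Hypothetical heap-backed type; not `Copy`, so moves and borrows matter.
#[derive(Debug, Clone, PartialEq)]
pub struct BigVec {
    data: Vec<f64>,
}

impl BigVec {
    pub fn new(data: Vec<f64>) -> Self {
        BigVec { data }
    }
}

// owned + borrowed: accumulate into the left operand's buffer.
impl<'a> Add<&'a BigVec> for BigVec {
    type Output = BigVec;
    fn add(mut self, rhs: &'a BigVec) -> BigVec {
        assert_eq!(self.data.len(), rhs.data.len());
        for (l, r) in self.data.iter_mut().zip(&rhs.data) {
            *l += *r;
        }
        self
    }
}

// owned + owned: reuse the left buffer; the right one is dropped afterwards.
impl Add<BigVec> for BigVec {
    type Output = BigVec;
    fn add(self, rhs: BigVec) -> BigVec {
        self + &rhs
    }
}

// borrowed + owned: addition commutes here, so reuse the right buffer.
impl<'r> Add<BigVec> for &'r BigVec {
    type Output = BigVec;
    fn add(self, rhs: BigVec) -> BigVec {
        rhs + self
    }
}

// borrowed + borrowed: nothing to reuse, so this is the only case that allocates.
impl<'r, 'a> Add<&'a BigVec> for &'r BigVec {
    type Output = BigVec;
    fn add(self, rhs: &'a BigVec) -> BigVec {
        self.clone() + rhs
    }
}

// Hypothetical usage: `a` and `b` are only borrowed, `c` is moved.
pub fn demo() -> BigVec {
    let a = BigVec::new(vec![1.0, 2.0]);
    let b = BigVec::new(vec![10.0, 20.0]);
    let c = &a + &b; // allocates a new buffer
    c + &a // reuses the buffer owned by `c`
}
```

The caller chooses the semantics explicitly: writing `&a` keeps `a` alive, while writing `a` hands its buffer to the result.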

    FWIW, you can take a look at this program and especially its assembly output. This piece of assembly, I believe, is responsible for all arithmetic operations:

    shrq    $11, %r14
    cvtsi2sdq   %r14, %xmm0
    mulsd   .LCPI0_0(%rip), %xmm0
    shrq    $11, %r15
    cvtsi2sdq   %r15, %xmm1
    mulsd   .LCPI0_0(%rip), %xmm1
    shrq    $11, %rbx
    cvtsi2sdq   %rbx, %xmm2
    mulsd   .LCPI0_0(%rip), %xmm2
    movaps  %xmm0, %xmm3
    addsd   %xmm1, %xmm3
    movaps  %xmm1, %xmm4
    addsd   %xmm2, %xmm4
    movaps  %xmm0, %xmm5
    addsd   %xmm2, %xmm5
    addsd   %xmm2, %xmm3
    addsd   %xmm0, %xmm4
    addsd   %xmm1, %xmm5
    movsd   %xmm3, 24(%rsp)
    movsd   %xmm4, 32(%rsp)
    movsd   %xmm5, 40(%rsp)
    leaq    (%rsp), %rdi
    leaq    24(%rsp), %rsi
    callq   _ZN13Vec3$LT$T$GT$9to_string20h7039822990634233867E
    

    Looks neat to me - the compiler has inlined all operations very nicely.
