How to implement MUL using all the other instructions in assembly?

Say I have implemented all of the ADD, AND, SHF, JUMP, BR, LDW, LDB (load word, load byte) and so on instructions, everything except MUL (multiply), on an assembly machine. Now I want to write a

2 answers

攒了一身酷

2021-01-22 07:28
The general idea is the same as you (should have) learned in school when you did "long multiplication", except we do it in binary instead of decimal. Consider the two examples below:
```
      1010        1234
    x 1100      x 2121
----------   ---------
      0000        1234
     0000        2468
    1010        1234
 + 1010      + 2468
 ---------   ---------
   1111000     2617314  
```
The example on the right is base-10 (decimal) and the example on the left is binary. Observe that in binary the only digits you ever multiply the top factor by are 0 and 1. Multiplying by zero is easy: the answer is always zero, so you don't even have to add it in. Multiplying by one is also easy: it is just a matter of knowing how far over to shift the top factor. That is easy too, because it is exactly as far over as you had to look to check that bit.
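To make the partial products concrete, here is a small sketch in Python (chosen only for illustration; the name `partial_products` is made up for this example). It lists one shifted copy of the top factor for every 1 bit in the bottom factor, using the binary example above:

```python
def partial_products(top, bottom):
    """Return one left-shifted copy of `top` per 1 bit in `bottom`."""
    products = []
    position = 0
    while bottom >> position:            # stop once no 1 bits remain
        if (bottom >> position) & 1:     # is this bit of `bottom` a 1?
            products.append(top << position)
        position += 1
    return products

rows = partial_products(0b1010, 0b1100)
print([bin(r) for r in rows])   # ['0b101000', '0b1010000']
print(bin(sum(rows)))           # 0b1111000
```

The two rows of zeros from the hand-worked example never show up, because a 0 bit contributes nothing to the sum.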

Start with a 16-bit working copy of one factor and a 16-bit accumulator set to zero. Take the other factor (call it "top") and shift it right one bit at a time; any time there is a one in its right-most bit, add the working copy to the accumulator. Whether that bit was a one or a zero, shift the working copy left by one bit. When "top" reaches zero you are done, and the answer is in the accumulator.
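The procedure above can be sketched as follows. This is Python rather than assembly for your machine, so treat it as an outline under assumptions: the names `mul_shift_add`, `working`, `acc` and `MASK16` are made up here, and the masking only imitates 16-bit registers. Each line of the loop uses nothing beyond an AND, an ADD, a shift and a branch.

```python
MASK16 = 0xFFFF  # imitate 16-bit registers (assumption for this sketch)

def mul_shift_add(top, bottom):
    """Multiply two unsigned 8-bit values using only AND, ADD and shifts."""
    working = bottom & MASK16    # 16-bit working copy, shifted left each step
    acc = 0                      # 16-bit accumulator
    while top != 0:              # done once "top" has been shifted down to zero
        if top & 1:              # right-most bit is a one: add the working copy
            acc = (acc + working) & MASK16
        working = (working << 1) & MASK16   # shift left whether or not we added
        top >>= 1                # move on to the next bit of "top"
    return acc

print(mul_shift_add(0b1010, 0b1100))   # 120
```

For 8-bit inputs the loop runs at most 8 times, and it exits early when the remaining high bits of "top" are all zero.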

There are some optimizations you can use so that you don't need as many 16-bit registers (or 8-bit register pairs), but I'll leave you to work out the details.

南笙

2021-01-22 07:34
It seems you are using an 8/16-bit processor similar to the 8080, 6502, 6800 and the like. Yes, an 8-iteration loop of shifts and adds is enough and almost optimal. On the other hand, if you can spare roughly 1 KB for a constant table, the approach using the following formula could be the fastest one:
```
a*b = square(a+b)/4 - square(a-b)/4
```
If the arguments are unsigned 8-bit values, the maximum of a+b is 510. You only need to keep the integer part of x**2/4 for each x, because the fractional parts cancel each other in the formula (a+b and a-b are always both even or both odd); so the mapping is: 0 -> 0, 1 -> 0, 2 -> 1, 3 -> 2, 4 -> 4, ..., 510 -> 65025. For signed arguments, the table is half the size.
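A minimal sketch of the table approach, again in Python only for illustration and assuming unsigned 8-bit arguments. The names `QUARTER_SQUARES` and `mul_quarter_square` are made up here. On a real machine the table would be precomputed constants in memory, so no squaring or division happens at run time.

```python
# floor(x*x / 4) for x = 0..510: 511 entries, each fits in 16 bits.
QUARTER_SQUARES = [(x * x) >> 2 for x in range(511)]

def mul_quarter_square(a, b):
    """Multiply two unsigned 8-bit values with two table lookups and a subtract."""
    # square(a-b) == square(b-a), so index with the absolute difference.
    diff = a - b if a >= b else b - a
    return QUARTER_SQUARES[a + b] - QUARTER_SQUARES[diff]

print(QUARTER_SQUARES[:5])           # [0, 0, 1, 2, 4]
print(mul_quarter_square(10, 12))    # 120
```

The cost per multiplication is one add, one subtract with a sign fix-up, two lookups and one 16-bit subtract, in exchange for the memory the table takes.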

There are many other approaches to fast multiplication, including ones with almost linear cost; see e.g. volume 2 of Donald Knuth's legendary book series "The Art of Computer Programming". But they all have far too much overhead for 8-bit arguments.