Integer Vs Long Confusion

前端 未结 7 1107
小鲜肉
小鲜肉 2020-11-27 18:49

I have seen many believe in the following

VBA converts all integer values to type Long

In fact, even the MSDN article says

相关标签:
7条回答
  • 2020-11-27 19:27

    “In recent versions, however, VBA converts all integer values to type Long, even if they're declared as type Integer.”

    I don't believe that documentation. Consider the following simple example (run in Excel 2010):

    Sub checkIntegerVsLong()
        
        'Check the total memory allocation for an array
        Dim bits As Integer 'or long? :)
        Dim arrInteger() As Integer
        ReDim arrInteger(1 To 5)
        arrInteger(1) = 12
        arrInteger(2) = 456
        'get total memory allocation for integer in array
        bits = VarPtr(arrInteger(2)) - VarPtr(arrInteger(1))
        Debug.Print "For integer: " & bits & " bits and " & bits * 8 & " bytes."
    
    
        Dim arrLong() As Long
        ReDim arrLong(1 To 5)
        arrLong(1) = 12
        arrLong(2) = 456
        
        'get memory allocation for long
        bits = VarPtr(arrLong(2)) - VarPtr(arrLong(1))
        Debug.Print "For long: " & bits & " bits and " & bits * 8 & " bytes."
    
    End Sub
    

    This prints:

    For integer: 2 bits and 16 bytes.

    For long: 4 bits and 32 bytes.

    You can also test this on individual variables using the following:

    Sub testIndividualValues()
        
        Dim j As Long
        Dim i As Integer
        Dim bits As Integer
        
        bits = LenB(i)
        Debug.Print "Length of integer: " & bits & " bits and " & bits * 8 & " bytes."
        bits = LenB(j)
        Debug.Print "Length of long: " & bits & " bits and " & bits * 8 & " bytes."
    
        
        
    End Sub
    

    which prints

    Length of integer: 2 bits and 16 bytes.

    Length of long: 4 bits and 32 bytes.

    Last, you can use a type comparison here:

    Public Type myIntegerType
        a As Integer
        b As Integer
    End Type
    Public Type myLongType
        a As Long
        b As Long
    End Type
    
    Public Sub testWithTypes()
        Dim testInt As myIntegerType
        Dim testLong As myLongType
        Dim bits As Integer
        
        bits = VarPtr(testInt.b) - VarPtr(testInt.a)
        Debug.Print "For integer in types: " & bits & " bits and " & bits * 8 & " bytes."
        
        bits = VarPtr(testLong.b) - VarPtr(testLong.a)
        Debug.Print "For long in types: " & bits & " bits and " & bits * 8 & " bytes."
        
    End Sub
    

    which also prints:

    For integer in types: 2 bits and 16 bytes.

    For long in types: 4 bits and 32 bytes.

    This is pretty compelling evidence to me that VBA actually does treat Integer and Long differently.

    If VBA silently converted behind the scenes, you would expect those to return the same number of bits/bytes for each of the pointer allocations locations. But in the first case, with Integer, it is only allocating 16 bits, while for Long variables, it allocates 32 bits.

    So what?

    So to your question of

    If VBA converts all Integer values to type Long even if they're declared as type Integer, then the above should never give you the Overflow error!

    It makes complete sense that you would get an Overflow error, as VBA has not actually allocated the memory for a Long to the Integer declaration.

    I would be curious too if this returns the same on all versions of Office. I can only test on Office 2010 on 64 bit Windows 7.

    0 讨论(0)
  • 2020-11-27 19:28

    As far as my testing goes, A VBA integer still takes two bytes (Tested on Access 2016, build 8201).

    Implicit casting to a long (and back, if it's a write operation) occurs for operations, not for storage, as far as I can find. E.g. if I do myInt + 1, myInt gets cast to a long, then one gets added to that long, and then the result is cast back to an int, resulting in a performance loss compared to just using a Long. So while it costs less memory to use an integer, all operations will suffer in performance.

    As Mathieu Guindon noted under Enderland/Elysian Fields's answer, testing VBA's storage with VBA functions can't prove anything, so let's go more low-level and directly view what's stored in memory, and manipulate that memory.

    First, declarations:

    Declare PtrSafe Sub CopyMemory Lib "Kernel32.dll" Alias "RtlMoveMemory" (ByVal Destination As LongPtr, ByVal Source As LongPtr, ByVal Length As Long)
    
    Public Function ToBits(b As Byte) As String
        Dim i As Integer
        For i = 7 To 0 Step -1
            ToBits = ToBits & IIf((b And 2 ^ i) = (2 ^ i), 1, 0)
        Next
    End Function
    

    Now, I'm going to prove two things:

    1. The memory VarPtr points to contains 16-bits integers
    2. Manipulating this memory manipulates the integers VBA uses, even if you manipulate it outside of VBA

    The code:

    Dim i(0 To 1) As Integer
    'Using negatives to prove things aren't longs, because of the sign bit
    i(0) = -2 ^ 15 + (2 ^ 0) '10000000 00000001
    i(1) = -2 ^ 15 + (2 ^ 1) '10000000 00000010
    Dim bytes(0 To 3) As Byte
    CopyMemory VarPtr(bytes(0)), VarPtr(i(0)), 4
    Dim l As Long
    For l = 3 To 0 Step -1
        Debug.Print ToBits(bytes(l)) & " ";
        'Prints 10000000 00000010 10000000 00000001
    Next
    'Now, let's write something back
    bytes(0) = &HFF '0xFFFF = signed -1
    bytes(1) = &HFF
    CopyMemory VarPtr(i(0)), VarPtr(bytes(0)), 2
    Debug.Print i(0) '-1
    

    So, we can be certain that VBA indeed both writes 2-byte integers to the memory, and reads them back from the memory, when we're declaring things as an integer.

    0 讨论(0)
  • 2020-11-27 19:34

    The conversion is only for memory optimization, not for user code. For the programmer, there is practically no change since the min/max limits of datatypes remain the same.

    If you take that para as a whole, you will realize that that statement is in context of performance only, and not otherwise. This is because the default size of numbers is Int32 or Int64 (depending on whether it is 32-bit or 64-bit system). The processor can process upto that big number in one go. If you declare a smaller unit than this, the compiler has to downsize it, and that needs more efforts than simply using the default type. And the processor has really no gain either. So even though you declare your variable as Integer, the compiler allocates it a Long memory, because it knows that it has to do more work without any gain.

    As a VBA programmer what is of significance for you is – Declare your variables as LONG instead of INTEGER even if you want to store small numbers in them.

    0 讨论(0)
  • 2020-11-27 19:35

    So far, I have not seen anyone mention the problem of byte alignment. In order to manipulate the numbers, it needs to be loaded into register and as a rule, a register cannot contain more than one variable. I believe that registers also need to be cleared from the previous instruction, so to ensure the variable is loaded correctly it needs to be re-aligned, which may also involve sign extending or zeroing out the register.

    You also can observe the byte alignment using VBA code:

    Public Type x
        a As Integer
        b As Integer
        l As Long
    End Type
    
    Public Type y
        a As Integer
        l As Long
        b As Integer
    End Type
    
    Public Sub test()
        Dim x As x
        Dim y As y
    
        Debug.Print LenB(x)
        Debug.Print LenB(x.a), LenB(x.b), LenB(x.l)
    
        Debug.Print LenB(y)
        Debug.Print LenB(y.a), LenB(y.l), LenB(y.b)
    End Sub
    

    Even though the UDT x and y contains same number of members and each member are same data type; with the only difference being the ordering of the members, LenB() will give different results; on a 32-bit platform, x consumes only 8 bytes whereas y will need 12 bytes. The high word between the x.a and x.l and after the x.b is simply ignored.

    The other point is that the problem is not unique to VBA. For example, C++ has the same considerations as illustrated here and here. So this is actually much lower level and thus why you can't "see" the sign-extending/zero-extending behavior when loading the variables into registers for performing the operation. To see that, you need the disassembly.

    0 讨论(0)
  • 2020-11-27 19:41

    Looking at the other answers and the MSDN documentation, I think the phrase "Internally stored" is imprecise and that's what's confusing.

    Tl;DR

    Integers aren't "stored internally" as Longs, that is they do not require the same amount of memory to save their values as a Long does. Rather they are "used internally" as Longs, meaning their value is temporarily stored in a Long variable whenever it is accessed (e.g incrementing a loop counter) before being copied back, and generally speaking, an array of Integers will require half as much memory as an array of Longs.


    @enderland's answer shows that the memory layout of Integers, Integer Arrays and UDTs composed of Integers such as a DWORD all conform to the idea that the value contained in a variable declared as an integer takes up 2 bytes of memory.

    This is from the viewpoint of VBA code, meaning it's possible to assume that

    1. The memory locations and sizes given by VarPtr and LenB respectively are incorrect (lies) in the case of Integers to avoid breaking existing code when the switch from 16 to 32 bit systems took place
    2. There is some sort of abstraction layer which means the memory appears as one thing but is actually another

    We can rule these both out.

    It is possible to use the CopyMemory API with an address given by the VarPtr and a width given by LenB to overwrite values in an array directly. The API is not under VBA's control, and all it does is directly write bits to memory. The fact that this is possible at all means that VarPtr must point to an area in memory where LenB bytes are used to store the value of that Integer; no other way around it, 2 bytes is the amount of space used to encode an Integer's value.

    The abstraction layer could still be true though; VBA could hold one array of 2 byte spaced memory (SAFEARRAYS are all consecutive memory, that's why CopyMemory can write 2 entries at once) where VarPtr points to. Meanwhile a separate 4 byte spaced block of memory shadows the 2 byte spaced block, staying constantly in sync so that Integers can be stored as Longs. Sounds weird but could happen right?


    It doesn't, and we can see this by looking at the process memory in Task Manager:

    Idle, Excel uses 155,860KB of memory (155,860 * 1024 bytes)

    Run this:

    Sub testLongs()
        Dim longs(500, 500, 500) As Long
        Stop
    End Sub
    

    ...and it spikes to 647,288KB. Taking the difference and dividing by the number of array elements gives ~4.03 bytes per Long. The same test for Integers:

    Sub t()
        Dim ints(500, 500, 500) As Integer
        Stop
    End Sub
    

    ...gives 401,548KB, or ~2.01 bytes per Integer

    There will be a slight variation in the idle memory usage so the exact numbers don't matter, but clearly the Integer array is using ~ half the memory of the Long array


    So my interpretation of the MSDN article is the following:

    Memory-wise, Integers are really stored as 2 byte values, not as 4 byte Longs. There is no abstraction or trickery with pointers to hide this from us.

    Rather the article tells us that when Integers are used in operations (multiplication/addition etc.) their values are fist copied to the lower half of an int32 / VBA Long, the calculation takes place in an optimized 32-bit friendly way, and then the result is copied back to Integer and overflow errors are raised as necessary. For Longs there is no need to copy forward and back (hence the recommendation).

    0 讨论(0)
  • 2020-11-27 19:47

    An integer declared as an Integer is still type checked as an Integer. The msdn documentation is referencing how the variable is stored internally. On a 32 bit system, an Integer will be stored in 32 BITS not Bytes, while on a 16 bit system the value is stored in a 16 BIT space or register, it would have been stored in 16. Hence the maximum size.

    There is no type conversion going on as far as VBA is concerned. An int is an int and a long is a long, even though they now take up just as much space.

    0 讨论(0)
提交回复
热议问题