Precisions and counts

北海茫月 2021-01-29 05:00

I am working with an educational dataset called IPEDS from the National Center for Education Statistics. They track students in college based upon major, degree completion, etc.

1 answer
  •  说谎 (original poster)
    2021-01-29 05:14

    This minimal example confirms your problem. (See, by the way, https://stackoverflow.com/help/mcve for advice on good examples.)

    * code 
    clear
    input code 
    14.2501 
    14.2501 
    14.2501 
    end 
    
    tab code if code == 14.2501
    tab code if code == float(14.2501)
    
    * results 
    . tab code if code == 14.2501
    no observations
    
    . tab code if code == float(14.2501)
    
           code |      Freq.     Percent        Cum.
    ------------+-----------------------------------
        14.2501 |          3      100.00      100.00
    ------------+-----------------------------------
          Total |          3      100.00
    

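    The keyword is one you use yourself: precision. In Stata, type search precision for resources, starting with blog posts by William Gould. A decimal like 14.2501 cannot be held exactly in binary, and the details of holding a variable as type float can bite. The variable holds 14.2501 rounded to float precision, while the literal 14.2501 in the if condition is evaluated in double precision, so the two values differ slightly and the equality test fails.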
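
    The mismatch can be inspected directly. This is a minimal sketch, not part of the original question: the %21x hexadecimal display format shows the binary value actually held, and storing the variable as double is offered only as one possible alternative to float(), on the assumption that the extra storage is acceptable for your data.

    ```stata
    * The literal is evaluated in double precision;
    * float() rounds it to float precision first
    display %21x 14.2501
    display %21x float(14.2501)

    * The two values differ, so this comparison is false (displays 0)
    display float(14.2501) == 14.2501

    * Alternative sketch (an assumption, not the asker's setup):
    * hold the variable as double so it matches the double literal
    clear
    input double code
    14.2501
    14.2501
    14.2501
    end
    count if code == 14.2501
    ```

    With the variable held as double, the final count should pick up all three observations without needing float().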

    It's hard to see what you're doing with your last block of code, which you don't explain. The last statement looks puzzling, as you're adding strings, and in Stata + concatenates strings. Consider what happens with

    . gen whatever =  "14.2501" + "14.3901" + "15.0999" + "40.0601"
    
    . di whatever[1]
    14.250114.390115.099940.0601
    

    The result is a long string that cannot be a valid cipcode. I suspect that you are reaching towards

     ... if inlist(cipcode_str, "14.2501", "14.3901", "15.0999", "40.0601") 
    

    which is quite different.
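
    A small sketch of how inlist() behaves on a string variable. The toy data and the indicator name wanted are hypothetical; only cipcode_str and the four codes come from the example above.

    ```stata
    * Toy data: a string version of the code variable (values are illustrative)
    clear
    input str7 cipcode_str
    "14.2501"
    "14.3901"
    "11.0101"
    end

    * wanted is 1 when cipcode_str equals any of the listed codes, 0 otherwise
    generate byte wanted = inlist(cipcode_str, "14.2501", "14.3901", "15.0999", "40.0601")
    list
    ```

    Because the comparison is between strings, no precision issue arises here: each code either matches one of the listed values exactly or it does not.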

    But using float() is the minimal trick for this problem.
