ICD-9 Code List in XML, CSV, or Database format [closed]

Posted by 烈酒焚心 on 2019-11-28 16:17:58

Question


I am looking for a complete list of ICD-9 Codes (Medical Codes) for Diseases and Procedures in a format that can be imported into a database and referenced programmatically. My question is essentially the same as "Looking for resources for ICD-9 codes", but the original poster never said where exactly they "got ahold of" the complete list.

Google has not been my friend here. I have spent many hours searching and have found many rich-text lists (such as the CDC's) and websites where I can drill down to the complete list interactively, but I cannot find the underlying list that populates these websites in a form that can be parsed into a database. I believe the files at ftp://ftp.cdc.gov/pub/Health_Statistics/NCHS/Publications/ICD9-CM/2009/ have what I am looking for, but they are in Rich Text Format and contain a lot of garbage and formatting that would be difficult to remove accurately.

I know this has to have been done by others, and I am trying to avoid duplicating other people's effort, but I just cannot find an XML/CSV/Excel list.


Answer 1:


The Centers for Medicare & Medicaid Services (CMS) provides Excel files that contain just the codes and diagnoses. These can be imported directly into some SQL databases without conversion.

Zipped Excel files, by version number

(Update: New link based on comment below)




Answer 2:


After removing the RTF, it wasn't too hard to parse the file and turn it into a CSV. My resulting parsed files, containing all 2009 ICD-9 codes for Diseases and Procedures, are here: http://www.jacotay.com/files/Disease_and_ProcedureCodes_Parsed.zip The parser I wrote is here: http://www.jacotay.com/files/RTFApp.zip Basically it is a two-step process: take the files from the CDC FTP site and strip the RTF from them, then select the RTF-free files and parse them into CSV files. The code is pretty rough because I only needed to get the results out once.

Here is the code for the parsing app in case the external links go down (it is the back end to a form that lets you select a filename and click the buttons to run each step)

Public Class Form1

Private Sub btnBrowse_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles btnBrowse.Click
    Dim p As New OpenFileDialog With {.CheckFileExists = True, .Multiselect = False}
    Dim pResult = p.ShowDialog()
    If pResult = Windows.Forms.DialogResult.Cancel OrElse pResult = Windows.Forms.DialogResult.Abort Then
        Exit Sub
    End If
    txtFileName.Text = p.FileName
End Sub

Private Sub btnGo_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles btnGo.Click
    Dim pFile = New IO.FileInfo(txtFileName.Text)
    Dim FileText = IO.File.ReadAllText(pFile.FullName)
    FileText = RemoveRTF(FileText)
    IO.File.WriteAllText(Replace(pFile.FullName, pFile.Extension, "_fixed" & pFile.Extension), FileText)

End Sub


Function RemoveRTF(ByVal rtfText As String) As String
    Dim rtBox As System.Windows.Forms.RichTextBox = New System.Windows.Forms.RichTextBox

    '// Get the contents of the RTF file. Note that when it is
    '// stored in the string, it is encoded as UTF-16.
    rtBox.Rtf = rtfText
    Dim plainText = rtBox.Text

    Return plainText
End Function


Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button1.Click
    Dim pFile = New IO.FileInfo(txtFileName.Text)
    Dim FileText = IO.File.ReadAllText(pFile.FullName)
    Dim DestFileLine As String = ""
    Dim DestFileText As New System.Text.StringBuilder

    'Need to parse at lines with numbers, lines with all caps are thrown away until next number
    FileText = Strings.Replace(FileText, vbCr, "")
    Dim pFileLines = FileText.Split(vbLf)
    Dim CurCode As String = ""
    For Each pLine In pFileLines
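        'Normalize tabs and trim first, so whitespace-only lines are skipped
        'instead of reaching the Substring calls below as an empty string
        pLine = pLine.Replace(ChrW(9), " ").Trim()
        If pLine.Length = 0 Then
            Continue For
        End If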

        Dim NonCodeLine As Boolean = False
        If IsNumeric(pLine.Substring(0, 1)) OrElse (pLine.Length > 3 AndAlso (pLine.Substring(0, 1) = "E" OrElse pLine.Substring(0, 1) = "V") AndAlso IsNumeric(pLine.Substring(1, 1))) Then
            Dim SpacePos As Int32
            SpacePos = InStr(pLine, " ")
            Dim NewCode As String
            NewCode = ""
            If SpacePos >= 3 Then
                NewCode = Strings.Left(pLine, SpacePos - 1)
            End If

            If SpacePos < 3 OrElse Strings.Mid(pLine, SpacePos - 1, 1) = "." OrElse InStr(NewCode, "-") > 0 Then
                NonCodeLine = True
            Else
                If CurCode <> "" Then
                    DestFileLine = Strings.Replace(DestFileLine, ",", "&#44;")
                    DestFileLine = Strings.Replace(DestFileLine, """", "&quot;").Trim
                    DestFileText.AppendLine(CurCode & ",""" & DestFileLine & """")
                    CurCode = ""
                    DestFileLine = ""
                End If

                CurCode = NewCode
                DestFileLine = Strings.Mid(pLine, SpacePos + 1)
            End If
        Else
            NonCodeLine = True
        End If


        If NonCodeLine = True AndAlso CurCode <> "" Then 'Not a code line: append it to the current description unless it is an all-caps heading
            Dim pReg As New System.Text.RegularExpressions.Regex("[a-z]")
            Dim pRegCaps As New System.Text.RegularExpressions.Regex("[A-Z]")
            If pReg.IsMatch(pLine) OrElse pLine.Length <= 5 OrElse pRegCaps.IsMatch(pLine) = False OrElse (Strings.Left(pLine, 3) = "NOS" OrElse Strings.Left(pLine, 2) = "IQ") Then
                DestFileLine &= " " & pLine
            Else 'Is all caps word
                DestFileLine = Strings.Replace(DestFileLine, ",", "&#44;")
                DestFileLine = Strings.Replace(DestFileLine, """", "&quot;").Trim
                DestFileText.AppendLine(CurCode & ",""" & DestFileLine & """")
                CurCode = ""
                DestFileLine = ""
            End If
        End If
    Next

    If CurCode <> "" Then
        DestFileLine = Strings.Replace(DestFileLine, ",", "&#44;")
        DestFileLine = Strings.Replace(DestFileLine, """", "&quot;").Trim
        DestFileText.AppendLine(CurCode & ",""" & DestFileLine & """")
        CurCode = ""
        DestFileLine = ""
    End If

    IO.File.WriteAllText(Replace(pFile.FullName, pFile.Extension, "_parsed" & pFile.Extension), DestFileText.ToString)
End Sub

End Class
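
Note that the parser above escapes commas as &#44; and double quotes as &quot; inside each description, so anything reading its output has to reverse that. A minimal sketch in Python of how the resulting CSV might be read back, assuming two columns per row (code, description); the function name and the sample rows are made up for illustration and are not taken from the actual files:

```python
import csv
import html
import io


def read_parsed_codes(stream):
    """Yield (code, description) pairs from the parser's CSV output."""
    for row in csv.reader(stream):
        if len(row) != 2:
            continue  # skip blank or malformed lines
        code, description = row
        # Reverse the &#44; and &quot; escaping applied by the VB.NET parser.
        yield code, html.unescape(description)


# Illustrative rows in the same shape as the parser's output (not real data).
sample = io.StringIO(
    '001.0,"Cholera due to Vibrio cholerae"\n'
    '000.0,"Made-up text&#44; with &quot;quoted&quot; words"\n'
)
codes = list(read_parsed_codes(sample))
```

For a real file, the same function should work on a handle from open(path, newline=""), though the encoding of the parsed files is not something I have checked.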




Answer 3:


The Centers for Medicare & Medicaid Services (CMS) is actually charged with ICD, so I think the CDC versions you reference may just be copies or reprocessed copies. Here is the (hard-to-find) Medicare page, which I think contains the original raw data (the "source of truth").

http://www.cms.gov/Medicare/Coding/ICD9ProviderDiagnosticCodes/codes.html

It looks like, as of this post, the latest version is v32. The zip you download will contain four plain-text files that map code to description (one file for every combination of DIAG|PROC and SHORT|LONG). It also contains two Excel files (one each for DIAG|PROC), which have three columns and so map each code to both descriptions (long and short).
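
As a rough sketch of how those plain-text files could be loaded into a database, here is a Python example using SQLite. It assumes each line is a code followed by whitespace and then the description; I have not verified the exact layout, so check it against the real files. The table name, column names, and sample lines are hypothetical:

```python
import sqlite3


def parse_description_line(line):
    """Split 'CODE  description' into (code, description), or None if blank."""
    parts = line.strip().split(None, 1)
    if len(parts) != 2:
        return None
    return parts[0], parts[1]


def load_codes(conn, lines, kind):
    """Insert parsed lines into icd9_codes; kind is 'DIAG' or 'PROC'."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS icd9_codes ("
        "kind TEXT NOT NULL, code TEXT NOT NULL, description TEXT NOT NULL, "
        "PRIMARY KEY (kind, code))"
    )
    rows = [
        (kind, *parsed)
        for parsed in map(parse_description_line, lines)
        if parsed is not None
    ]
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT OR REPLACE INTO icd9_codes VALUES (?, ?, ?)", rows
        )
    return len(rows)


# Hypothetical sample lines; the assumed layout is code, spaces, description.
sample_lines = [
    "0010  Cholera due to vibrio cholerae\n",
    "\n",
    "0011  Cholera due to vibrio cholerae el tor\n",
]
conn = sqlite3.connect(":memory:")
inserted = load_codes(conn, sample_lines, "DIAG")
row = conn.execute(
    "SELECT description FROM icd9_codes WHERE code = ?", ("0011",)
).fetchone()
```

The kind column is part of the key because a diagnosis code and a procedure code can look identical once the decimal point is dropped. Real files may also need an explicit encoding passed to open().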




Answer 4:


This is clearly a very old thread, but I recently undertook this task and wrote it up here, with links to the source data:

http://colinwhite.net/dropplets/ICD

I was trying to get both ICD-9 and ICD-10 into a SQLite database.

Seems to have worked well.
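
The write-up linked above has the details. Purely as an illustration of one way to keep both revisions in a single SQLite table, here is a Python sketch; the schema, the function names, and the choice to store codes without the decimal point are my assumptions and not necessarily what the linked write-up does:

```python
import sqlite3


def normalize(code):
    """Drop the decimal point and surrounding spaces, and upper-case letters."""
    return code.strip().replace(".", "").upper()


def make_db():
    """Create an in-memory database keyed on (revision, code)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE icd_codes ("
        "revision INTEGER NOT NULL, "  # 9 or 10
        "code TEXT NOT NULL, "
        "description TEXT NOT NULL, "
        "PRIMARY KEY (revision, code))"
    )
    return conn


def add_code(conn, revision, code, description):
    """Store one code under its ICD revision."""
    with conn:
        conn.execute(
            "INSERT INTO icd_codes VALUES (?, ?, ?)",
            (revision, normalize(code), description),
        )


def lookup(conn, revision, code):
    """Return the description for a code, or None if it is not present."""
    row = conn.execute(
        "SELECT description FROM icd_codes WHERE revision = ? AND code = ?",
        (revision, normalize(code)),
    ).fetchone()
    return row[0] if row else None


# Two illustrative rows, one per revision.
conn = make_db()
add_code(conn, 9, "001.0", "Cholera due to vibrio cholerae")
add_code(conn, 10, "A00.0", "Cholera due to Vibrio cholerae 01, biovar cholerae")
```

Keying on the revision as well as the code lets a lookup accept either "001.0" or "0010" without an ICD-9 code being confused with an ICD-10 code that happens to look the same.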




Answer 5:


You can get the original RTF code files from here: http://ftp.cdc.gov/pub/Health_Statistics/NCHS/Publications/ICD9-CM/2009/



Source: https://stackoverflow.com/questions/3653811/icd-9-code-list-in-xml-csv-or-database-format
