Excel formula: find part number in file path text string

Submitted by 坚强是说给别人听的谎言 on 2019-12-25 01:31:03

Question


I have an extract of all the files on a network drive, and some of the file names contain a part number in the format 0000-000000-00. The file holds 600,000+ path names, and I'm trying to work out how to extract the part numbers from them. I think a MID formula might work, but I'm at a loss as to how to tell it to find anything in the part number format 0000-000000-00 and extract only those 14 characters from the path.

Input looks like this:

c:\users\stuff\folder_name\1234-000001-01_ baskets_1.pdf
c:\users\stuff\folder_name\1234-000001-02_ baskets_2.pdf
c:\users\stuff\folder_name\1234-000001-03_ baskets_3.pdf
c:\users\stuff\folder_name\1234-000030-01_ tree_30.pdf
c:\users\stuff\folder_name\random text_1234-000030-02_ tree_30.pdf
c:\users\stuff\folder_name\more random stuff_1234-000030-02_ tree_30.pdf

Output I'm hoping for:

1234-000001-01
1234-000001-02
1234-000001-03
1234-000030-01

Answer 1:


Since you have a pattern we can exploit, use this:

=MID(A1,SEARCH("????-??????-??",A1),14)

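SEARCH finds the position where the pattern starts, and MID returns the 14 characters beginning at that position. Note that the ? wildcard matches any single character, not only digits, and that SEARCH returns #VALUE! when the pattern is not found.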
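Because the ? wildcard in SEARCH accepts any character, the formula can pick up a non-numeric run that happens to have dashes in the same places. The sketch below is not Excel; it is a rough Python imitation, assuming a regex of the form `.{4}-.{6}-.{2}` is a fair stand-in for "????-??????-??". The `tricky` path and the `first_match` helper are made up for illustration.

```python
import re

# Rough stand-in for Excel's "????-??????-??": "?" matches any single
# character, and SEARCH ignores case.
WILDCARD_LIKE = re.compile(r".{4}-.{6}-.{2}", re.IGNORECASE)

# Same shape, but restricted to the digits 0-9.
DIGITS_ONLY = re.compile(r"\d{4}-\d{6}-\d{2}", re.ASCII)


def first_match(pattern, text):
    """Return the first match of pattern in text, or "" if there is none."""
    m = pattern.search(text)
    return m.group(0) if m else ""


# Hypothetical path: a folder name with the same dash layout as a part number.
tricky = r"c:\users\stuff\abcd-efghij-kl_1234-000001-01_ baskets_1.pdf"

print(first_match(WILDCARD_LIKE, tricky))  # abcd-efghij-kl
print(first_match(DIGITS_ONLY, tricky))    # 1234-000001-01
```

None of the sample paths in the question contain a stray dash, so the formula gives the expected result for them; the difference only matters if other dashed names appear among the 600,000+ paths.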




Answer 2:


You wanted a formula but a UDF could also be used to apply a regex to get the pattern (a little overkill in this instance but worth being aware of):

Option Explicit
Public Sub GetCustomString()
    Dim i As Long, tests()
    tests = Array("c:\users\stuff\folder_name\1234-000001-01_ baskets_1.pdf", _
    "c:\users\stuff\folder_name\1234-000001-02_ baskets_2.pdf", _
    "c:\users\stuff\folder_name\1234-000001-03_ baskets_3.pdf", _
    "c:\users\stuff\folder_name\1234-000030-01_ tree_30.pdf", _
    "c:\users\stuff\folder_name\random text_1234-000030-02_ tree_30.pdf", _
    "c:\users\stuff\folder_name\more random stuff_1234-000030-02_ tree_30.pdf")

    For i = LBound(tests) To UBound(tests)
        Debug.Print GetString(tests(i))
    Next
End Sub

Public Function GetString(ByVal inputString As String) As String
    Dim re As Object
    Set re = CreateObject("VBScript.RegExp")
    With re
        .Global = True
        .MultiLine = True
        .IgnoreCase = False
        .Pattern = "\d{4}-\d{6}-\d{2}"
        If .test(inputString) Then
            GetString = .Execute(inputString)(0).Value   ' first match only
        Else
            GetString = vbNullString
        End If
    End With
End Function

Using the UDF in a sheet: with a path in A1, enter =GetString(A1) in another cell.


Pattern: \d{4}-\d{6}-\d{2}

Explanation:

\d{4} matches a digit (equal to [0-9]) exactly 4 times

"-" matches the character - literally (case sensitive)

\d{6} matches a digit (equal to [0-9]) exactly 6 times

"-" matches the character - literally (case sensitive)

\d{2} matches a digit (equal to [0-9]) exactly 2 times

Global pattern flags:

g modifier: global. All matches (don't return after first match)

m modifier: multi line. Causes ^ and $ to match the beginning/end of each line (not only the beginning/end of the string)
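To check the pattern against the sample paths without opening Excel, the same expression can be tried in Python. This is only a sketch that mirrors what the UDF does (first match, or an empty string when nothing matches); `get_string` and `PART_NUMBER` are illustrative names, and `re.ASCII` is used so that `\d` means exactly [0-9].

```python
import re

# Same pattern as the UDF: 4 digits, dash, 6 digits, dash, 2 digits.
PART_NUMBER = re.compile(r"\d{4}-\d{6}-\d{2}", re.ASCII)


def get_string(input_string: str) -> str:
    """Return the first part number in input_string, or "" if none is found."""
    m = PART_NUMBER.search(input_string)
    return m.group(0) if m else ""


# Sample paths taken from the question.
paths = [
    r"c:\users\stuff\folder_name\1234-000001-01_ baskets_1.pdf",
    r"c:\users\stuff\folder_name\1234-000001-02_ baskets_2.pdf",
    r"c:\users\stuff\folder_name\1234-000001-03_ baskets_3.pdf",
    r"c:\users\stuff\folder_name\1234-000030-01_ tree_30.pdf",
    r"c:\users\stuff\folder_name\random text_1234-000030-02_ tree_30.pdf",
    r"c:\users\stuff\folder_name\more random stuff_1234-000030-02_ tree_30.pdf",
]

for path in paths:
    print(get_string(path))
```

Only the first match is returned, as in the UDF, so the global flag has no effect on the result here.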



Source: https://stackoverflow.com/questions/51792735/excel-formula-find-part-number-in-file-path-text-string
