Search files recursively using the Google Drive REST API

一整个雨季 2021-01-13 16:40

I am trying to retrieve all the files under a parent directory. The parent directory contains many subdirectories, and those subdirectories in turn contain the files.



        
2 answers
  • 2021-01-13 17:07

    Here is an answer to your question.

    Take a tree like the one in your scenario:

    folderA____ folderA1____folderA1a
           \____folderA2____folderA2a
                        \___folderA2b
    

    There are three alternative approaches that should give you an idea of how to proceed.

    Alternative 1. Recursion

    The temptation would be to list the children of folderA and, for any children that are folders, recursively list their children, rinse, repeat. In a very small number of cases this might be the best approach, but for most it has the following problem:

    • It is woefully time consuming to do a server round trip for each sub folder. This does of course depend on the size of your tree, so if you can guarantee that your tree size is small, it could be OK.
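
    To make the round-trip cost concrete, here is a minimal sketch of this approach, written as a breadth-first walk. It assumes a hypothetical list_children callable standing in for a files.list request, and the file IDs in FAKE_DRIVE are made up for illustration; nothing here talks to the real API.

```python
from collections import deque


def walk_tree(list_children, root_id):
    """Visit every folder under root_id, one list_children call per folder.

    list_children is a hypothetical callable: folder ID -> list of
    (child_id, is_folder) pairs. With the real API it would wrap files.list.
    """
    file_ids = []
    round_trips = 0
    queue = deque([root_id])
    while queue:
        folder_id = queue.popleft()
        round_trips += 1  # one server call per folder visited
        for child_id, is_folder in list_children(folder_id):
            if is_folder:
                queue.append(child_id)
            else:
                file_ids.append(child_id)
    return file_ids, round_trips


# In-memory stand-in for the tree in the diagram, with made-up file IDs.
FAKE_DRIVE = {
    "folderA": [("folderA1", True), ("folderA2", True)],
    "folderA1": [("folderA1a", True)],
    "folderA1a": [("file1", False)],
    "folderA2": [("folderA2a", True), ("folderA2b", True)],
    "folderA2a": [],
    "folderA2b": [("file2", False), ("file3", False)],
}

files, calls = walk_tree(lambda fid: FAKE_DRIVE[fid], "folderA")
print(files, calls)
```

    For the six-folder tree above that is six list calls, and the count grows with every sub folder you add.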

    Alternative 2. The common parent

    This works best if all of the files are being created by your app (i.e. you are using the drive.file scope). As well as the folder hierarchy above, create a dummy parent folder called, say, "MyAppCommonParent". As you create each file as a child of its particular folder, you also make it a child of MyAppCommonParent. This becomes a lot more intuitive if you remember to think of folders as labels. You can now easily retrieve all descendants by simply querying for MyAppCommonParent in parents. Note that this relies on a file having more than one parent, which the Drive API has not allowed since its move to a single-parent model in 2020.

    Alternative 3. Folders first

    Start by getting all folders. Yep, all of them. Once you have them all in memory, you can crawl through their parents properties and build your tree structure and list of folder IDs. You can then do a single files.list?q='<folderA ID>' in parents or '<folderA1 ID>' in parents or '<folderA1a ID>' in parents.... Using this technique you can get everything in two queries, one for folders and one for files, although each query may take several HTTP calls if the results are paged.
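
    The tree-building step might look something like the sketch below. It assumes the first query returned each folder as a dict with id and parents keys (i.e. fields="files(id, parents)"); descendant_folder_ids and parents_query are illustrative names, not part of the Drive API, and the sample IDs are made up.

```python
def descendant_folder_ids(folders, root_id):
    """Return root_id plus the IDs of every folder nested below it.

    folders is a list of dicts with "id" and (optionally) "parents" keys.
    """
    # Index the flat list by parent so each folder's children are one lookup away.
    children = {}
    for folder in folders:
        for parent_id in folder.get("parents", []):
            children.setdefault(parent_id, []).append(folder["id"])

    found = [root_id]
    seen = {root_id}
    stack = [root_id]
    while stack:
        for child_id in children.get(stack.pop(), []):
            if child_id not in seen:  # guard against visiting a folder twice
                seen.add(child_id)
                found.append(child_id)
                stack.append(child_id)
    return found


def parents_query(folder_ids):
    """Build the q string for the second call: one 'in parents' clause per ID."""
    return " or ".join("'{}' in parents".format(fid) for fid in folder_ids)


# Made-up folder records mirroring the diagram, plus one unrelated folder.
all_folders = [
    {"id": "A1", "parents": ["A"]},
    {"id": "A2", "parents": ["A"]},
    {"id": "A1a", "parents": ["A1"]},
    {"id": "A2a", "parents": ["A2"]},
    {"id": "A2b", "parents": ["A2"]},
    {"id": "other", "parents": ["elsewhere"]},
]

ids = descendant_folder_ids(all_folders, "A")
print(parents_query(ids))
```

    The resulting string goes into the q parameter of the second files.list call. With a very large tree the query can become long, so you may need to split the ID list across several calls.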

    Alternative 2 is the most efficient, but only works if you have control of file creation. Alternative 3 is generally more efficient than Alternative 1, but there may be certain small tree sizes where Alternative 1 is best.

  • 2021-01-13 17:14
    from googleapiclient.discovery import build
    from oauth2client.service_account import ServiceAccountCredentials

    scope = ['https://www.googleapis.com/auth/drive']

    # Path to the service account's JSON key file.
    credentials = ServiceAccountCredentials.from_json_keyfile_name('your JSON credentials', scope)

    service = build('drive', 'v3', credentials=credentials)

    folder_tree = "NAME OF THE FOLDER YOU WANT TO START YOUR SEARCH"
    folder_id = "ID OF THE FOLDER YOU WANT TO START YOUR SEARCH"
    folder_ids = {folder_tree: folder_id}  # maps "path/to/item" -> Drive ID

    FOLDER_MIME = 'application/vnd.google-apps.folder'

    def list_children(folder_id, mime_filter):
        # Return all non-trashed children matching mime_filter, following nextPageToken.
        children = []
        page_token = None
        while True:
            response = service.files().list(
                q=mime_filter + " and '" + folder_id + "' in parents and trashed = false",
                fields="nextPageToken, files(id, name)",
                pageSize=400,
                pageToken=page_token).execute()
            children.extend(response.get('files', []))
            page_token = response.get('nextPageToken')
            if page_token is None:
                return children

    def check_for_subfolders(folder_id):
        return list_children(folder_id, "mimeType='" + FOLDER_MIME + "'")

    def check_for_files(folder_id):
        return list_children(folder_id, "mimeType!='" + FOLDER_MIME + "'")

    def get_folder_tree(folder_id, folder_tree):
        # Record the files in this folder, then recurse into each subfolder.
        all_files = check_for_files(folder_id)
        if not all_files:
            print('No Files Found')
        for file in all_files:
            folder_ids[folder_tree + '/' + file['name']] = file['id']
            print("Files added :", file['name'])
        for folder in check_for_subfolders(folder_id):
            subfolder_pattern = folder_tree + '/' + folder['name']
            folder_ids[subfolder_pattern] = folder['id']
            print('Current Folder Tree : ', subfolder_pattern)
            get_folder_tree(folder['id'], subfolder_pattern)
        return folder_ids

    folder_ids = get_folder_tree(folder_id, folder_tree)
    