So my problem is that a have a few files not showing up in gcsfuse when mounted. I see them in the online console and if I \'ls\' with gsutils. Also, if If I manually create th
@Brandon Yarbrough suggests creating needed directory entries in the GCS bucket. This avoids the performance penalty described by @jacobsa.
Here is a bash
script for doing so:
# 1. Mount $BUCKET_NAME at $MOUNT_PT
# 2. Run this script
MOUNT_PT=${1:-HOME/mnt}
BUCKET_NAME=$2
DEL_OUTFILE=${3:-y} # Set to y or n
echo "Reading objects in $BUCKET_NAME"
OUTFILE=dir_names.txt
gsutil ls -r gs://$BUCKET_NAME/** | while read BUCKET_OBJ
do
dirname "$BUCKET_OBJ"
done | sort -u > $OUTFILE
echo "Processing directories found"
cat $OUTFILE | while read DIR_NAME
do
LOCAL_DIR=`echo "$DIR_NAME" | sed "s=gs://$BUCKET_NAME/==" | sed "s=gs://$BUCKET_NAME=="`
#echo $LOCAL_DIR
TARG_DIR="$MOUNT_PT/$LOCAL_DIR"
if ! [ -d "$TARG_DIR" ]
then
echo "Creating $TARG_DIR"
mkdir -p "$TARG_DIR"
fi
done
if [ $DEL_OUTFILE = "y" ]
then
rm $OUTFILE
fi
echo "Process complete"
I wrote this script, and have shared it at https://github.com/mherzog01/util/blob/main/sh/mk_bucket_dirs.sh.
This script assumes that you have mounted a GCS bucket locally on a Linux (or similar) system. The script first specifies the GCS bucket and location where the bucket is mounted. It then identifies all "directories" in the GCS bucket which are not visible locally, and creates them.
This (for me) fixed the issue with folders (and associated objects) not showing up in the mounted folder structure.
Google Cloud Storage doesn't have folders. The various interfaces use different tricks to pretend that folders exist, but ultimately there's just an object whose name contains a bunch of slashes. For example, "pictures/january/0001.jpg" is the full name of a single object.
If you need to be sure that a "folder" exists, put an object inside it.
By default, gcsfuse won't show a directory "implicitly" defined by a file with a slash in its name. For example if your bucket contains an object named dir/foo.txt
, you won't be able to find it unless there is also an object nameddir/
.
You can work around this by setting the --implicit-dirs
flag, but there are good reasons why this is not the default. See the documentation for more information.