Question
I have a somewhat complex directory structure of NetCDF files that I want to create a THREDDS catalog for.
/data/buoy/A0121/realtime/A0121.met.realtime.nc
                         /A0121.waves.realtime.nc
                         etc.
/data/buoy/A0122/realtime/A0122.met.realtime.nc
                         /A0122.sbe37.realtime.nc
                         etc.
/data/buoy/B0122/realtime/B0122.met.realtime.nc
                         /B0122.sbe37.realtime.nc
                         etc.
However, I have found that the regExp attribute in both the datasetScan and aggregation/scan elements does not seem to match across subdirectories. For example, this catalog entry works:
<datasetScan name="All TEST REALTIME" ID="all_test_realtime" path="/All/Realtime"
             location="/data/buoy/B0122" >
  <metadata inherited="true">
    <serviceName>all</serviceName>
  </metadata>
  <filter>
    <include regExp="realtime" atomic="false" collection="true" />
    <include wildcard="*.nc" />
    <!-- exclude directory -->
    <exclude wildcard="old" atomic="false" collection="true" />
  </filter>
</datasetScan>
But the following does not; no datasets are found:
<datasetScan name="All TEST REALTIME" ID="all_test_realtime" path="/All/Realtime"
             location="/data/buoy" >
  <metadata inherited="true">
    <serviceName>all</serviceName>
  </metadata>
  <filter>
    <include regExp="B0122/realtime" atomic="false" collection="true" />
    <include wildcard="*.nc" />
    <!-- exclude directory -->
    <exclude wildcard="old" atomic="false" collection="true" />
  </filter>
</datasetScan>
This is a greatly simplified example, made only to confirm that regExp does not match subdirectories, which is implied at the bottom of this NcML page: http://www.unidata.ucar.edu/software/thredds/current/netcdf-java/ncml/v2.2/AnnotatedSchema4.html
My real goal is to use NcML aggregation via <scan regExp="">.
Should I be using FeatureCollections? These are pretty simple time series buoy observation files.
Answer 1:
If you are scanning files for an <aggregation> and you want to include subdirectories, you can add subdirs="true" inside the <scan> element, for example:
<?xml version="1.0" encoding="UTF-8"?>
<netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2">
  <aggregation dimName="ocean_time" type="joinExisting">
    <scan location="." regExp=".*vs_his_[0-9]{4}\.nc$" subdirs="true"/>
  </aggregation>
</netcdf>
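Adapted to the buoy layout in the question, the same idea might look like the sketch below. It is only an illustration under assumptions the question does not state: that the files share a record dimension named `time`, that one buoy's met files from several subdirectories are to be joined along it, and that the `<scan>` `regExp` is tested against the full path name (which is why the pattern begins with `.*`).

```xml
<?xml version="1.0" encoding="UTF-8"?>
<netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2">
  <!-- "time" is an assumed dimension name; use the files' actual record dimension -->
  <aggregation dimName="time" type="joinExisting">
    <!-- subdirs="true" descends into realtime/ and any sibling directories;
         the regExp keeps only this buoy's met files -->
    <scan location="/data/buoy/A0121/" regExp=".*A0121\.met\..*\.nc$" subdirs="true"/>
  </aggregation>
</netcdf>
```

Starting the scan at a single buoy directory keeps observations from different buoys out of the same time axis.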
For datasetScan datasets, the regExp filter is applied automatically in every subdirectory, so if you wanted those filters to apply throughout the tree, you could just do:
<datasetScan name="All TEST REALTIME" ID="all_test_realtime" path="/All/Realtime"
             location="/data/buoy" >
  <metadata inherited="true">
    <serviceName>all</serviceName>
  </metadata>
  <filter>
    <include regExp="realtime" atomic="false" collection="true" />
    <include wildcard="*.nc" />
    <!-- exclude directory -->
    <exclude wildcard="old" atomic="false" collection="true" />
  </filter>
</datasetScan>
Answer 2:
<filter>
  <include regExp="[A-Z]{1}[0-9]{4}" atomic="false" collection="true" />
  <include wildcard="realtime" atomic="false" collection="true" />
  <include wildcard="post-recovery" atomic="false" collection="true" />
  <include wildcard="*.nc" />
  <!-- exclude directory -->
  <exclude wildcard="old" atomic="false" collection="true" />
</filter>
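Placed inside the datasetScan from the question, a filter of this kind might look like the sketch below. The reading behind it is an assumption, not something the answer states: each selector appears to be tested against a single file or directory name rather than a path, so a pattern containing a slash, such as `B0122/realtime`, would never match, and each directory level needs its own `include`. The `name`, `ID`, `path`, and `location` values are copied from the question.

```xml
<datasetScan name="All TEST REALTIME" ID="all_test_realtime" path="/All/Realtime"
             location="/data/buoy">
  <metadata inherited="true">
    <serviceName>all</serviceName>
  </metadata>
  <filter>
    <!-- first level: buoy directories such as A0121 or B0122 -->
    <include regExp="[A-Z][0-9]{4}" atomic="false" collection="true" />
    <!-- second level: the realtime directory inside each buoy directory -->
    <include wildcard="realtime" atomic="false" collection="true" />
    <!-- data files -->
    <include wildcard="*.nc" />
    <!-- exclude directory -->
    <exclude wildcard="old" atomic="false" collection="true" />
  </filter>
</datasetScan>
```

Here `[A-Z][0-9]{4}` is equivalent to the `[A-Z]{1}[0-9]{4}` used above.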
Source: https://stackoverflow.com/questions/19385287/regexp-for-matching-directories