regExp for matching directories

Posted by 本小妞迷上赌 on 2020-01-04 02:51:09

Question


I have a somewhat complex directory structure of NetCDF files for which I want to create a THREDDS catalog.

/data/buoy/A0121/realtime/A0121.met.realtime.nc
                         /A0121.waves.realtime.nc
                         etc.
/data/buoy/A0122/realtime/A0122.met.realtime.nc
                         /A0122.sbe37.realtime.nc
                         etc.
/data/buoy/B0122/realtime/B0122.met.realtime.nc
                         /B0122.sbe37.realtime.nc
etc.

But I have found that the regExp attribute, in both datasetScan and aggregation/scan elements, does not seem to be able to match across subdirectories. For example, this catalog entry works:

<datasetScan name="All TEST REALTIME" ID="all_test_realtime" path="/All/Realtime"
   location="/data/buoy/B0122" >
  <metadata inherited="true">
    <serviceName>all</serviceName>
  </metadata>
  <filter>
    <include regExp="realtime" atomic="false" collection="true" />
    <include wildcard="*.nc" />
    <!-- exclude directory -->
    <exclude wildcard="old" atomic="false" collection="true" />
  </filter>
</datasetScan>

But the following does not; no datasets are found:

<datasetScan name="All TEST REALTIME" ID="all_test_realtime" path="/All/Realtime" 
  location="/data/buoy" >
  <metadata inherited="true">
    <serviceName>all</serviceName>
  </metadata>
  <filter>
    <include regExp="B0122/realtime" atomic="false" collection="true" />
    <include wildcard="*.nc" />
    <!-- exclude directory -->
    <exclude wildcard="old" atomic="false" collection="true" />
  </filter>
</datasetScan>

This is a greatly simplified example, done just to confirm that regExp does not match subdirectories, which is implied at the bottom of this NcML page: http://www.unidata.ucar.edu/software/thredds/current/netcdf-java/ncml/v2.2/AnnotatedSchema4.html

My real goal is to use NcML aggregation via <scan regExp="">.

Should I be using FeatureCollections? These are pretty simple time series buoy observation files.


Answer 1:


If you are scanning files for an <aggregation> and you want to include subdirectories, you can add subdirs="true" inside the <scan> element, for example:

<?xml version="1.0" encoding="UTF-8"?>
<netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2">
    <aggregation dimName="ocean_time" type="joinExisting">
        <scan location="." regExp=".*vs_his_[0-9]{4}\.nc$" subdirs="true"/>        
    </aggregation>
</netcdf>
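Applied to the directory layout in the question, the same idea might look like the sketch below. It is illustrative only and rests on several assumptions: that dimName="time" is the record dimension in the buoy files, that the aggregation is limited to one buoy and one file family (A0121 met files), and that scan tests regExp against each file's full path, which is why the pattern starts with ".*" and spans the directory separator.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2">
    <!-- dimName="time" is an assumption; use the record dimension name from your files -->
    <aggregation dimName="time" type="joinExisting">
        <!-- Assumes regExp is tested against the full path of each file found
             under location, including files in subdirectories such as realtime -->
        <scan location="/data/buoy/A0121" regExp=".*/A0121\.met\..*\.nc$" subdirs="true"/>
    </aggregation>
</netcdf>
```

With the layout shown in the question, this pattern would select /data/buoy/A0121/realtime/A0121.met.realtime.nc and skip the waves file in the same directory.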

For datasetScan datasets, the regExp filter is applied automatically in all subdirectories, so if you want those filters applied to every subdirectory, you can just do:

<datasetScan name="All TEST REALTIME" ID="all_test_realtime" path="/All/Realtime" 
  location="/data/buoy" >
  <metadata inherited="true">
    <serviceName>all</serviceName>
  </metadata>
  <filter>
    <include regExp="realtime" atomic="false" collection="true" />
    <include wildcard="*.nc" />
    <!-- exclude directory -->
    <exclude wildcard="old" atomic="false" collection="true" />
  </filter>
</datasetScan>



Answer 2:


<filter>
  <include regExp="[A-Z]{1}[0-9]{4}" atomic="false" collection="true" />
  <include wildcard="realtime" atomic="false" collection="true" />
  <include wildcard="post-recovery" atomic="false" collection="true" />
  <include wildcard="*.nc" />
  <!-- exclude directory -->
  <exclude wildcard="old" atomic="false" collection="true" />
</filter>
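This answer gives only the filter. As a sketch of where it would sit, here it is placed inside the question's second datasetScan (the one with location="/data/buoy"). The added comments reflect an assumption that the answer does not state: that each include is tested against one directory or file name at a time, so one pattern per directory level is needed instead of a single pattern containing "/".

```xml
<datasetScan name="All TEST REALTIME" ID="all_test_realtime" path="/All/Realtime"
  location="/data/buoy" >
  <metadata inherited="true">
    <serviceName>all</serviceName>
  </metadata>
  <filter>
    <!-- first level: buoy directories such as A0121 or B0122 -->
    <include regExp="[A-Z]{1}[0-9]{4}" atomic="false" collection="true" />
    <!-- second level: the realtime and post-recovery directories -->
    <include wildcard="realtime" atomic="false" collection="true" />
    <include wildcard="post-recovery" atomic="false" collection="true" />
    <!-- data files inside those directories -->
    <include wildcard="*.nc" />
    <!-- exclude directory -->
    <exclude wildcard="old" atomic="false" collection="true" />
  </filter>
</datasetScan>
```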


Source: https://stackoverflow.com/questions/19385287/regexp-for-matching-directories
