Python file cache

鱼传尺愫 2021-01-03 00:15

I'm creating some objects from files (validators from templates xsd files, to draw together other xsd files, as it happens), and I'd like to recreate the objects when the

3 answers
  • 2021-01-03 00:29

    Unless there is a specific reason to pass it as an argument, I would make the cache a global object.

  • 2021-01-03 00:35

    Three thoughts.

    1. Use try ... except ... else for neater control flow.

    2. File modification times are notoriously unreliable: in particular, they don't necessarily correspond to the most recent time the file was modified.

    3. Python 3.2 and later include a caching decorator, functools.lru_cache. Here's the source.

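      from collections import OrderedDict, namedtuple
      from functools import wraps
      from threading import Lock

      # Result type returned by cache_info()
      _CacheInfo = namedtuple("CacheInfo", "hits misses maxsize currsize")

      def lru_cache(maxsize=100):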
          """Least-recently-used cache decorator.
      
          If *maxsize* is set to None, the LRU features are disabled and the cache
          can grow without bound.
      
          Arguments to the cached function must be hashable.
      
          View the cache statistics named tuple (hits, misses, maxsize, currsize) with
          f.cache_info().  Clear the cache and statistics with f.cache_clear().
          Access the underlying function with f.__wrapped__.
      
          See:  http://en.wikipedia.org/wiki/Cache_algorithms#Least_Recently_Used
      
          """
          # Users should only access the lru_cache through its public API:
          #       cache_info, cache_clear, and f.__wrapped__
          # The internals of the lru_cache are encapsulated for thread safety and
          # to allow the implementation to change (including a possible C version).
      
          def decorating_function(user_function,
                      tuple=tuple, sorted=sorted, len=len, KeyError=KeyError):
      
              hits = misses = 0
              kwd_mark = (object(),)          # separates positional and keyword args
              lock = Lock()                   # needed because ordereddicts aren't threadsafe
      
              if maxsize is None:
                  cache = dict()              # simple cache without ordering or size limit
      
                  @wraps(user_function)
                  def wrapper(*args, **kwds):
                      nonlocal hits, misses
                      key = args
                      if kwds:
                          key += kwd_mark + tuple(sorted(kwds.items()))
                      try:
                          result = cache[key]
                          hits += 1
                      except KeyError:
                          result = user_function(*args, **kwds)
                          cache[key] = result
                          misses += 1
                      return result
              else:
                  cache = OrderedDict()       # ordered least recent to most recent
                  cache_popitem = cache.popitem
                  cache_renew = cache.move_to_end
      
                  @wraps(user_function)
                  def wrapper(*args, **kwds):
                      nonlocal hits, misses
                      key = args
                      if kwds:
                          key += kwd_mark + tuple(sorted(kwds.items()))
                      try:
                          with lock:
                              result = cache[key]
                              cache_renew(key)        # record recent use of this key
                              hits += 1
                      except KeyError:
                          result = user_function(*args, **kwds)
                          with lock:
                              cache[key] = result     # record recent use of this key
                              misses += 1
                              if len(cache) > maxsize:
                                  cache_popitem(0)    # purge least recently used cache entry
                      return result
      
              def cache_info():
                  """Report cache statistics"""
                  with lock:
                      return _CacheInfo(hits, misses, maxsize, len(cache))
      
              def cache_clear():
                  """Clear the cache and cache statistics"""
                  nonlocal hits, misses
                  with lock:
                      cache.clear()
                      hits = misses = 0
      
              wrapper.cache_info = cache_info
              wrapper.cache_clear = cache_clear
              return wrapper
      
          return decorating_function
      
  • 2021-01-03 00:39

    Your code (including the cache logic) looks fine.

    Consider moving the cache variable outside the function definition. That will make it possible to add other functions to clear or inspect the cache.

    If you want to look at code that does something similar, see the source of the filecmp module (http://hg.python.org/cpython/file/2.7/Lib/filecmp.py). The interesting part is how the stat module is used to determine whether a file has changed. Here is the signature function:

    import stat

    def _sig(st):
        # Signature of an os.stat() result: file type, size, and modification time
        return (stat.S_IFMT(st.st_mode),
                st.st_size,
                st.st_mtime)
    