问题
I'm writing a stateful server in Clojure backed by Neo4j that can serve socket requests, like HTTP. Which means, of course, that I need to be able to start and stop socket servers from within this server. Design-wise, I would want to be able to declare a "service" within this server and start and stop it.
What I'm trying to wrap my mind around in Clojure is how to ensure that starting and stopping these services is thread-safe. This server I'm writing will have NREPL embedded inside it and process incoming requests in a parallel way. Some of these requests will be administrative: start service X, stop service Y. Which opens up the possibility that two start requests come in at the same time.
- Starting should synchronously check a "running" flag and a "starting" flag and fail if either are set. In the same transaction, the "starting" flag should be set.
- After the "starting" flag is set, the transaction closes. That makes the "starting" flag visible to other transactions.
- Then the (start) function actually starts the service.
- If (start) succeeds, the "running" and "starting" flags are synchronously set.
- If (start) fails, the "starting" flag is set and the exception is returned.
Stopping needs the same thing, checking a "running" flag and checking and setting it's own "stopping" flag.
I'm trying to reason through all possible combinations of (start) and (stop).
Have I missed anything?
Is there a library for this already? If not, what should a library like this look like? I'll open source it and put it on Github.
Edit:
This is what I have so far. There's a hole I can see though. What am I missing?
(ns extenium.db
(:require [clojure.tools.logging :as log])
(:import org.neo4j.graphdb.factory.GraphDatabaseFactory))
(def ^:private
db- (ref {:ref nil
:running false
:starting false
:stopping false}))
(defn stop []
(dosync
(if (or (not (:running (ensure db-)))
(:stopping (ensure db-)))
(throw (IllegalStateException. "Database already stopped or stopping."))
(alter db- assoc :stopping true)))
(try
(log/info "Stopping database")
(.shutdown (:ref db-))
(dosync
(alter db- assoc :ref nil))
(log/info "Stopped database")
(finally
(dosync
(alter db- assoc :stopping false)))))
In the try block, I log, then call .shutdown, then log again. If the first log fails (I/O exceptions can happen), then (:stopping db-) is set to false, which unblocks it and is fine. .shutdown is a void function from Neo4j, so I don't have to evaluate a return value. If it fails, (:stopping db-) is set to false, so that's fine too. Then I set the (:ref db-) to nil. What if that fails? (:stopping db-) is set to false, but the (:ref db-) is left hanging. So that's a hole. Same case with the second log call. What am I missing?
Would this be better if I just used Clojure's locking primitives instead of a ref dance?
回答1:
This is actually a natural fit for a simple lock:
(locking x
(do-stuff))
Here x
is the object on which to synchronize.
To elaborate: starting and stopping a service is a side effect; side effects should not be initiated from inside a transaction, except possibly as Agent actions. Here though locks are exactly what the design calls for. Note that there's nothing wrong in using them in Clojure when they are a good fit for the problem at hand, in fact I would say locking
is the canonical solution here. (See Stuart Halloway's Lancet, introduced in Programming Clojure (1st ed.), for an example of a Clojure library using locks which has seen some widespread use, mostly in Leiningen.)
Update: Adding fail-fast behaviour:
This is still a good fit for a lock, namely a java.util.concurrent.locks.ReentrantLock (follow link for Javadoc):
(import java.util.concurrent.locks.ReentrantLock)
(def lock (ReentrantLock.))
(defn start []
(if (.tryLock lock)
(try
(do-stuff)
(finally (.unlock lock)))
(do-other-stuff)))
(do-stuff)
will be executed if lock acquisition succeeds; otherwise, (do-other-stuff)
will happen. Current thread will not block in either case.
回答2:
This sounds like a good use case for agents, they allow you to serialize changes to a piece of mutable state, the Clojure Agents documentation has a good overview. You can use the error handler and agent-error methods to handle exceptions and never need to worry about locks or race conditions.
(def service (agent {:status :stopped}))
(defn start-service [{:keys [status] :as curr}]
(if (= :stopped status)
(do
(println "starting service")
{:status :started})
(do
(println "service already running")
curr)))
;; start the service like this
(send-off service start-service)
;; gets the current status of the service
@service
来源:https://stackoverflow.com/questions/16131420/canonical-way-to-ensure-only-one-instance-of-a-service-is-running-starting-s