I'm in the process of parameterizing my Bokeh apps by having my Flask app expose model data via a route dedicated to jsonifying the requested data passed via query string a
This appears to be not an issue with Bokeh per se but rather an issue with threading and blocking in the server that's running the Flask app.
It's reproducible apart from Bokeh entirely...
import requests
from flask import Flask, jsonify, request
import pandas
import pdb
flask_app = Flask(__name__)
# Populate some model maintained by the flask application
modelDf = pandas.DataFrame()
nData = 100
modelDf[ 'c1_x' ] = range(nData)
modelDf[ 'c1_y' ] = [ x*x for x in range(nData) ]
modelDf[ 'c2_x' ] = range(nData)
modelDf[ 'c2_y' ] = [ 2*x for x in range(nData) ]
@flask_app.route('/', methods=['GET'] )
def index():
    res = "<table>"
    res += "<tr><td><a href=\"http://localhost:8080/sendModelData/c1\">SEND C1</a></td></tr>"
    res += "<tr><td><a href=\"http://localhost:8080/sendModelData/c2\">SEND C2</a></td></tr>"
    res += "<tr><td><a href=\"http://localhost:8080/RequestsOverFlaskNoProxy?colName=c1\">REQUEST OVER FLASK NO PROXY C1</a></td></tr>"
    res += "<tr><td><a href=\"http://localhost:8080/RequestsOverFlaskNoProxy?colName=c2\">REQUEST OVER FLASK NO PROXY C2</a></td></tr>"
    res += "<tr><td><a href=\"http://localhost:8080/RequestsOverFlask?colName=c1\">REQUEST OVER FLASK C1</a></td></tr>"
    res += "<tr><td><a href=\"http://localhost:8080/RequestsOverFlask?colName=c2\">REQUEST OVER FLASK C2</a></td></tr>"
    res += "</table>"
    return res
@flask_app.route('/RequestsOverFlaskNoProxy')
def requestsOverFlaskNoProxy() :
    print("RequestsOverFlaskNoProxy")
    # get column name from query string
    colName = request.args.get('colName')
    # get model data from Flask
    url = "http://localhost:8080/sendModelData/%s" % colName
    print("Get data from %s" % url )
    # trust_env = False makes requests ignore the proxy environment variables
    session = requests.Session()
    session.trust_env = False
    res = session.get( url , timeout=5000 , verify=False )
    print( "CODE %s" % res.status_code )
    print( "ENCODING %s" % res.encoding )
    print( "TEXT %s" % res.text )
    data = res.json()
    return data
@flask_app.route('/RequestsOverFlask')
def requestsOverFlask() :
    # get column name from query string
    colName = request.args.get('colName')
    # get model data from Flask
    url = "http://localhost:8080/sendModelData/%s" % colName
    res = requests.get( url , timeout=None , verify=False )
    print( "CODE %s" % res.status_code )
    print( "ENCODING %s" % res.encoding )
    print( "TEXT %s" % res.text )
    data = res.json()
    return data
@flask_app.route('/sendModelData/<colName>' , methods=['GET'] )
def sendModelData( colName ) :
    x = modelDf[ colName + "_x" ].tolist()
    y = modelDf[ colName + "_y" ].tolist()
    return jsonify( x=x , y=y )
if __name__ == '__main__':
    print('Opening Flask app on http://localhost:8080/')
    # THIS DOES NOT WORK
    #flask_app.run( host='0.0.0.0' , port=8080 , debug=True )
    # THIS WORKS
    flask_app.run( host='0.0.0.0' , port=8080 , debug=True , threaded=True )
One can see from the screenshot that serving data directly from sendModelData
renders the JSON appropriately, but fetching it via the requests.get
method raises an exception due to a 503 status code, as reported in the Python console.
If I make the same attempt while eliminating the effect of the proxies, which I have enabled via environment variables, the request never completes and leaves the browser spinning indefinitely.
Come to think of it, it may be completely unnecessary to even use requests as a middleman; I should be able to just get the JSON string and deserialize it myself. That would work in this setup, but in my actual code the Bokeh rendering is done in a completely different Python module than the Flask application, so these functions are not even available unless I scramble the layering of the app.
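A minimal sketch of that idea, assuming the model could live in a module of its own that both the Flask routes and the Bokeh code import. The names `MODEL` and `get_model_data` are hypothetical, and a plain dict of lists stands in for the pandas DataFrame so the snippet runs on its own:

```python
import json

# Stand-in for modelDf: same column names, built without pandas (assumption).
N_DATA = 100
MODEL = {
    "c1_x": list(range(N_DATA)),
    "c1_y": [x * x for x in range(N_DATA)],
    "c2_x": list(range(N_DATA)),
    "c2_y": [2 * x for x in range(N_DATA)],
}


def get_model_data(col_name):
    """Return the x/y payload for one column pair as a plain dict."""
    return {"x": MODEL[col_name + "_x"], "y": MODEL[col_name + "_y"]}


# Any Python caller can use the dict directly, with no HTTP request involved.
payload = get_model_data("c2")

# A route like sendModelData would only need to serialize the same dict.
body = json.dumps(payload)
```

Whether this fits depends on how the real app is layered; it only helps if both sides can import the shared module.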
EDIT: As it turns out, the fundamental thing I was violating has to do with Flask's development environment...
You are running your WSGI app with the Flask test server, which by default uses a single thread to handle requests. So when your one request thread tries to call back into the same server, it is still busy trying to handle that one request. https://stackoverflow.com/a/22878916/1330381
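The blocking behaviour described in that quote can be reproduced without Flask at all. The following is a rough sketch that uses the standard library's `http.server` as a stand-in for the Flask test server; `Handler`, `fetch_via_proxy`, the `/data` and `/proxy` paths, and the 2-second timeout are all made up for illustration:

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer, ThreadingHTTPServer

# Opener that ignores proxy environment variables, similar in spirit to
# session.trust_env = False in requests.
OPENER = urllib.request.build_opener(urllib.request.ProxyHandler({}))


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/data":
            body = b'{"x": [0, 1, 2]}'
        else:
            # Any other path calls back into the same server for /data.
            port = self.server.server_address[1]
            try:
                url = "http://127.0.0.1:%d/data" % port
                with OPENER.open(url, timeout=2) as inner:
                    body = inner.read()
            except OSError:
                # The only request thread is busy here, so nobody answers.
                body = b"timed out"
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the console quiet


def fetch_via_proxy(server_cls):
    """Start server_cls on a free port, request /proxy, return the body."""
    server = server_cls(("127.0.0.1", 0), Handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    port = server.server_address[1]
    try:
        url = "http://127.0.0.1:%d/proxy" % port
        with OPENER.open(url, timeout=10) as outer:
            return outer.read()
    finally:
        server.shutdown()
        server.server_close()


single = fetch_via_proxy(HTTPServer)             # one request at a time
threaded = fetch_via_proxy(ThreadingHTTPServer)  # one thread per request
```

Under these assumptions, `single` should come back as the timeout marker and `threaded` as the JSON body, which mirrors the difference between running Flask with and without threaded=True.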
So then the question becomes how to apply this threaded=True technique in the original Bokeh example. This may not be possible because the flask_embed.py example relies on the Tornado WSGI server, and this question suggests Tornado is single-threaded by design.
Given the above findings, an even keener question is how the AjaxDataSource
altogether avoids these threading issues faced by the requests
module.
Update: Some more background on the Bokeh and Tornado coupling...
53:05 so they're actually are not very many, the question is about the dependencies for Bokeh and the Bokeh server. The new Bokeh server is built on tornado and that's pretty much the main dependency is it uses tornado. Beyond that there's not very many dependencies, runtime dependencies, for Bokeh. pandas is an optional dependency for Bokeh.charts. There's other dependencies, you know numpy is used. But there's only, the list of dependencies I think is six or seven. We've tried to pare it down greatly over the years and so, but the main dependency of the server is tornado. Intro to Data Visualization with Bokeh - Part 1 - Strata Hadoop San Jose 2016