Question
I have a very simple application. A user uploads a pdf file to a postgres database via the web front end. That pdf should then be rendered in the browser via pdfjs.
I'm fairly certain my issue is an encoding one, but I don't think I understand encoding well enough to answer this on my own.
My model:
class Lesson(Base):
    __tablename__ = 'lessons'
    # Name of the lesson
    lesson_order = db.Column(db.Enum(LessonIndexes), nullable=False)
    name = db.Column(db.String(128), nullable=False)
    summary = db.Column(db.String(500))
    lesson_plan_id = db.Column(db.Integer(), ForeignKey('lesson_plans.id'), nullable=False)
    pdf = db.Column(db.LargeBinary())
My Controller:
@mod_lp.route('/<lesson_plan_id>/create_lesson', methods=["POST"])
def create_lesson(lesson_plan_id):
    form = LessonForm()
    file = request.files['pdf']  # type: FileStorage
    if form.validate_on_submit():
        file = request.files['pdf']
        lesson = Lesson(form.lesson_order.data, form.name.data, form.summary.data, lesson_plan_id,
                        pdf=file.read()  # this line here
                        )
        db.session.add(lesson)
        db.session.commit()
    return redirect(url_for('lesson_plan.show', lesson_plan_id=lesson_plan_id))
This stores the data to look something like:
%PDF-1.4
%����
1 0 obj
<</Creator (Mozilla/5.0 \(Macintosh; Intel Mac OS X 10_12_6\) AppleWebKit/537.36 \(KHTML, like Gecko\) Chrome/60.0.3112.113 Safari/537.36)
/Producer (Skia/PDF m60)
/CreationDate (D:20170916222407+00'00')
/ModDate (D:20170916222407+00'00')>>
endobj
2 0 obj
<</Filter /FlateDecode
/Length 1370>> stream
x���ݎ�4��<������� qq$8�@%`aB�H�_�����T�E���ړ�c'�t�Z��[������}�{�I���@���
(etc...)
My JavaScript (taken from the PDF.js "hello world" example):
var pdfString = "{{ pdf_data}}";
var pdfData = atob(pdfString);
if (pdfData) {
    var loadingTask = PDFJS.getDocument({data: pdfData});
    loadingTask.promise.then(function (pdf) {
        console.log('PDF loaded');
        // Fetch the first page
        var pageNumber = 1;
        pdf.getPage(pageNumber).then(function (page) {
            console.log('Page loaded');
            var scale = 1.5;
            var viewport = page.getViewport(scale);
            // Prepare canvas using PDF page dimensions
            var canvas = document.getElementById('pdf-canvas');
            var context = canvas.getContext('2d');
            canvas.height = viewport.height;
            canvas.width = viewport.width;
            // Render PDF page into canvas context
            var renderContext = {
                canvasContext: context,
                viewport: viewport
            };
            var renderTask = page.render(renderContext);
            renderTask.then(function () {
                console.log('Page rendered');
            });
        });
    }, function (reason) {
        // PDF loading error
        console.error(reason);
    });
}
The current error I have is:
6:108 Uncaught DOMException: Failed to execute 'atob' on 'Window': The string to be decoded is not correctly encoded.
Things I've tried:
file.stream.getvalue()
file.stream.getvalue().decode("latin-1") # for whatever reason, this was the only 'decode' that didn't throw an error
file.stream.getvalue().decode("latin-1").encode()
base64.b64encode(file.stream.getvalue().decode("latin-1").encode())
But these all failed in various ways.
UPDATE:
If I send the binary data in the database to my template:
pdf_data = lesson.pdf
and forget about calling atob on it:
var pdfData = pdfString;
if (pdfData) {
...
I get this error:
Error: Invalid XRef stream header
pdf.worker.js:340 at error (http://0.0.0.0:8080/static/js/pdfjs/build/pdf.worker.js:340:17)
at XRef_readXRef [as readXRef] (http://0.0.0.0:8080/static/js/pdfjs/build/pdf.worker.js:20943:13)
at XRef_parse [as parse] (http://0.0.0.0:8080/static/js/pdfjs/build/pdf.worker.js:20613:28)
at PDFDocument_setup [as setup] (http://0.0.0.0:8080/static/js/pdfjs/build/pdf.worker.js:26445:17)
at PDFDocument_parse [as parse] (http://0.0.0.0:8080/static/js/pdfjs/build/pdf.worker.js:26336:12)
at http://0.0.0.0:8080/static/js/pdfjs/build/pdf.worker.js:36120:28
at Promise (<anonymous>)
at LocalPdfManager_ensure [as ensure] (http://0.0.0.0:8080/static/js/pdfjs/build/pdf.worker.js:36115:14)
at LocalPdfManager.BasePdfManager_ensureDoc [as ensureDoc] (http://0.0.0.0:8080/static/js/pdfjs/build/pdf.worker.js:36067:19)
Answer 1:
atob expects a base64-encoded string, and the raw PDF bytes you are passing to it are not base64. I'm pretty sure that is the error you are seeing. I put together a basic example that at least gets a successful call to atob. You could probably also store the base64-encoded content in the Postgres table so that you don't need to encode it on every request. Here 'source.pdf' is just a sample PDF I had on disk; you can swap in the data from your Postgres table.
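To illustrate the point about atob, here is a minimal Python sketch of the same check done server-side. The `raw_pdf` bytes are a made-up stand-in for the contents of the `pdf` column, not real data from the question. It shows that raw PDF bytes are rejected by a strict base64 decoder, while the output of `base64.b64encode` is plain ASCII text that decodes back to the original bytes.

```python
import base64
import binascii

# Hypothetical stand-in for the bytes stored in the LargeBinary column.
raw_pdf = b"%PDF-1.4\n%\xe2\xe3\xcf\xd3\n1 0 obj\n"

# Raw PDF bytes contain characters outside the base64 alphabet ('%', '\n', ...),
# so a strict decoder rejects them, much like atob does in the browser.
try:
    base64.b64decode(raw_pdf, validate=True)
    raw_is_base64 = True
except binascii.Error:
    raw_is_base64 = False

# b64encode returns bytes; decode to str so it can be dropped into a template.
encoded = base64.b64encode(raw_pdf).decode("ascii")

# Decoding the encoded text gives back exactly the original bytes.
decoded = base64.b64decode(encoded, validate=True)
```

The `.decode("ascii")` step matters in Python 3: without it the template would receive a `bytes` object rather than a plain string.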
flask_app.py
from flask import Flask, request, render_template
import base64

app = Flask(__name__)


@app.route("/testing", methods=["GET"])
def get_test_file():
    with open("source.pdf", "rb") as data_file:
        data = data_file.read()
    encoded_data = base64.b64encode(data).decode('utf-8')
    return render_template("test.html", encoded_data=encoded_data)
test.html
<html>
<head>
</head>
<body>
<script>
var encoded_data = '{{ encoded_data }}';
var pdf_data = atob(encoded_data);
</script>
</body>
</html>
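For swapping 'source.pdf' out for the database value, a small helper along these lines might be enough. This is only a sketch: `pdf_blob_to_base64` is a hypothetical name, and the `memoryview` case is an assumption about what the database driver may hand back for a binary column (plain `bytes` works the same way).

```python
import base64


def pdf_blob_to_base64(pdf_blob):
    """Return base64 text for a bytes-like PDF blob (hypothetical helper)."""
    # bytes() accepts bytes, bytearray and memoryview, so the helper does not
    # care which of those the database layer returned.
    return base64.b64encode(bytes(pdf_blob)).decode("ascii")


# Made-up stand-in for a value read from the pdf column.
stored = memoryview(b"%PDF-1.4\n\x00\xff binary payload")
encoded_data = pdf_blob_to_base64(stored)
```

In the view, `encoded_data` would then be passed to `render_template` in place of the value built from 'source.pdf'.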
Source: https://stackoverflow.com/questions/46265079/flask-postgres-display-pdf-with-pdfjs