Question
I'm trying to crawl the web and store the HTML in MongoDB using Java. Unfortunately, when I store the data, the MongoDB driver seems to null it out, and the document ends up with an empty field where the HTML should be.
If I take only the first 500 characters of the HTML, I can store/upsert it without a problem, so I think something in the HTML (or the JavaScript inside it) corrupts the command sent to MongoDB, and MongoDB stores empty data instead of the HTML. (EDIT: I've also tried 40,000 and 50,000 characters: 40,000 was OK, but the 50,000-character data didn't show up in MongoDB.) Should I use something else for storing HTML/JavaScript data?
Here is my code snippet:
BasicDBObject savedDoc = new BasicDBObject();
savedDoc.put("url_ID", objURL.get("_id"));
savedDoc.put("cnt", content); //Content field
savedDoc.put("st", 0);
collection.update(new BasicDBObject().append("url_ID", objURL.get("_id")), savedDoc, true, false);
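The manual trials with 500, 40,000 and 50,000 characters could be narrowed down with a binary search over prefix lengths. The following is only a minimal sketch under stated assumptions: `PrefixBisect`, `longestStorablePrefix`, `utf8Bytes` and the `roundTrips` predicate are hypothetical names, not part of the MongoDB driver, and the search assumes that if a prefix fails, every longer prefix fails too. In `main` the predicate is a stand-in that pretends anything over 45,000 characters is lost; in a real run it would upsert the prefix and read the `cnt` field back to compare.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.function.Predicate;

final class PrefixBisect {

    /**
     * Returns the length of the longest prefix of content accepted by roundTrips.
     * Assumption: acceptance is monotonic (a failing prefix implies all longer
     * prefixes fail) and the empty prefix is always accepted.
     * Note: a cut may land between the two chars of a surrogate pair.
     */
    static int longestStorablePrefix(String content, Predicate<String> roundTrips) {
        int lo = 0;                 // longest length known (or assumed) to pass
        int hi = content.length();  // longest length that might still pass
        while (lo < hi) {
            int mid = lo + (hi - lo + 1) / 2;   // round up so the loop always progresses
            if (roundTrips.test(content.substring(0, mid))) {
                lo = mid;           // this prefix survived, try longer ones
            } else {
                hi = mid - 1;       // this prefix was lost, try shorter ones
            }
        }
        return lo;
    }

    /** Size of the string once encoded as UTF-8, the encoding BSON uses for strings. */
    static int utf8Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        char[] buf = new char[50_000];
        Arrays.fill(buf, 'x');
        String content = new String(buf);

        // Hypothetical stand-in for "upsert the prefix, read it back, compare".
        Predicate<String> fakeStore = s -> s.length() <= 45_000;

        int n = longestStorablePrefix(content, fakeStore);
        System.out.println(n + " chars, " + utf8Bytes(content.substring(0, n)) + " UTF-8 bytes");
    }
}
```

Inspecting the characters just after the reported length would show whether a particular piece of markup or script is involved, or whether the cutoff depends only on size.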
Source: https://stackoverflow.com/questions/12139656/how-to-store-html-data-in-mongodb