How to store HTML data in MongoDB?

Submitted by 人盡茶涼 on 2019-12-13 12:31:45

Question


I'm trying to crawl the web and store the HTML data in MongoDB using Java. Unfortunately, when I store the data, the MongoDB driver nulls it out and saves an empty field in place of the HTML.

If I take only the first 500 characters of the HTML, I can store/upsert it without a problem, so I think something in the HTML (or the JavaScript inside it) corrupts the command sent to MongoDB, and MongoDB stores empty data instead of the HTML. (EDIT: I've also tried with 40,000 and 50,000 characters: 40,000 was OK, but the 50,000-character data didn't show up in MongoDB.) Should I use something else for storing HTML/JavaScript data?

Here is my code snippet:

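// Build the document that replaces any existing entry for this URL.
BasicDBObject savedDoc = new BasicDBObject();
savedDoc.put("url_ID", objURL.get("_id"));
savedDoc.put("cnt", content); // Content field (the HTML string)
savedDoc.put("st", 0);
// Match on url_ID; the last two arguments are upsert = true, multi = false.
collection.update(new BasicDBObject().append("url_ID", objURL.get("_id")), savedDoc, true, false);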
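
One way to rule out document size as the cause is to measure the content before the upsert. The following is only a minimal sketch and not part of my original code: the class `HtmlSizeCheck`, its method names, and the 1024-byte overhead allowance are hypothetical, chosen for illustration. The 16 MiB constant is MongoDB's documented maximum BSON document size in current versions (older releases had a lower limit).

```java
import java.nio.charset.StandardCharsets;

class HtmlSizeCheck {
    // Documented maximum BSON document size in current MongoDB versions (16 MiB).
    static final int MAX_BSON_DOCUMENT_BYTES = 16 * 1024 * 1024;

    // BSON stores strings as UTF-8, so count encoded bytes rather than Java chars.
    static int utf8Length(String content) {
        return content.getBytes(StandardCharsets.UTF_8).length;
    }

    // overheadBytes is a rough, assumed allowance for the other fields and BSON framing.
    static boolean fitsInOneDocument(String content, int overheadBytes) {
        return (long) utf8Length(content) + overheadBytes <= MAX_BSON_DOCUMENT_BYTES;
    }

    public static void main(String[] args) {
        // Stand-in for a crawled page of 50,000 characters.
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 50000; i++) {
            sb.append('x');
        }
        String content = sb.toString();

        System.out.println("chars=" + content.length()
                + " utf8Bytes=" + utf8Length(content)
                + " fits=" + fitsInOneDocument(content, 1024));
    }
}
```

If a 50,000-character page reports a byte count far below the limit, the document size alone is unlikely to be what blanks the `cnt` field, and the cause is more likely somewhere else, such as the content itself or the way the stored document is being viewed.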

Source: https://stackoverflow.com/questions/12139656/how-to-store-html-data-in-mongodb
