Non-English characters in database using Java

白昼怎懂夜的黑 submitted on 2019-12-24 16:07:55

Question


I have to save non-English (special) characters in MySQL from Java code. When I try to do so, the data is saved as ??????

String dataStr = "κωνσταντίνα";
System.out.println("Before " + dataStr);
String dataStr1 = new String(dataStr.getBytes("ISO-8859-1"), "UTF-8");
System.out.println("after " + dataStr1);
String st = URLDecoder.decode("κωνσταντίνα", "UTF-8");
cd.setTransactionDescription(dataStr1);

Answer 1:


You really should try making everything UTF-8 from point to point.
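
To see why the conversion in the question turns the text into question marks, here is a minimal sketch (the class and method names are made up for illustration). ISO-8859-1 has no Greek letters, so encoding to it replaces each one with `?` before the data ever reaches MySQL, whereas a UTF-8 round trip keeps the text intact.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of the original question or answer.
class EncodingRoundTrip {

    // Mirrors the conversion from the question: encode as ISO-8859-1, decode as UTF-8.
    static String viaLatin1(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.ISO_8859_1);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Encodes and decodes with the same charset, UTF-8 on both ends.
    static String viaUtf8(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // The Greek name from the question, written with Unicode escapes
        // so that the encoding of the source file does not matter.
        String name = "\u03ba\u03c9\u03bd\u03c3\u03c4\u03b1\u03bd\u03c4\u03af\u03bd\u03b1";
        System.out.println(viaLatin1(name).equals("???????????")); // true
        System.out.println(viaUtf8(name).equals(name));            // true
    }
}
```

The string is already correct Unicode inside the JVM, so no manual re-encoding is needed; it only has to be carried as UTF-8 across each boundary.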

Use an appropriate Unicode-aware collation for the database and its tables. I always specify it per table, even when the database default is already set. This answer covers a lot of MySQL + Java and servlet ground, but together these points should address most of what you need to know when developing Unicode-aware Java applications.

CREATE DATABASE mydb DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_swedish_ci;

CREATE TABLE tMyTable (
  id int(11) NOT NULL auto_increment,
  code VARCHAR(20) NOT NULL,
  name VARCHAR(20) NOT NULL,
  PRIMARY KEY (id)
) ENGINE=InnoDB DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_swedish_ci;

Use the JDBC connection string to enable Unicode translation between the driver and the server.

<Resource name="jdbc/mydb" auth="Container" type="javax.sql.DataSource"
  maxActive="10" maxIdle="2" maxWait="10000"
  username="myuid" password="mypwd"
  driverClassName="com.mysql.jdbc.Driver"
  url="jdbc:mysql://localhost:3306/mydb?useUnicode=true&amp;characterEncoding=utf8"
  validationQuery="SELECT 1"
/>
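
Outside a container-managed DataSource, the same URL parameters can be passed straight to DriverManager. A rough sketch, assuming the `mydb` database and `tMyTable` table from above; the class, the helper names, and the sample `code` value are illustrative only. In plain Java the parameter separator is a bare `&`, whereas inside an XML file it has to be written as `&amp;`.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// Illustrative helper; the names are assumptions, not part of the original answer.
class UnicodeJdbc {

    // Builds a MySQL JDBC URL that asks the driver to use UTF-8 on the connection.
    static String buildUrl(String host, int port, String database) {
        return "jdbc:mysql://" + host + ":" + port + "/" + database
                + "?useUnicode=true&characterEncoding=utf8";
    }

    // Inserts one row; the string is bound as a parameter, with no manual byte conversion.
    static int insertName(Connection con, String code, String name) throws SQLException {
        String sql = "INSERT INTO tMyTable (code, name) VALUES (?, ?)";
        try (PreparedStatement ps = con.prepareStatement(sql)) {
            ps.setString(1, code);
            ps.setString(2, name);
            return ps.executeUpdate();
        }
    }

    public static void main(String[] args) throws SQLException {
        // Needs a running MySQL server and the MySQL driver on the classpath.
        String url = buildUrl("localhost", 3306, "mydb");
        try (Connection con = DriverManager.getConnection(url, "myuid", "mypwd")) {
            insertName(con, "gr1", "\u03ba\u03c9\u03bd\u03c3\u03c4\u03b1\u03bd\u03c4\u03af\u03bd\u03b1");
        }
    }
}
```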

Force Tomcat to use the content-type charset for both GET and POST parameter strings by setting the useBodyEncodingForURI attribute on the HTTP and HTTPS connectors (tomcat/conf/server.xml file).

<Connector port="8080"
           maxThreads="150" minSpareThreads="25" maxSpareThreads="75"
           enableLookups="false" redirectPort="8443" acceptCount="100"
           debug="0" connectionTimeout="20000"
           disableUploadTimeout="true"
           useBodyEncodingForURI="true"
/>

At the start of each servlet, make sure Tomcat parses request parameters as UTF-8. You need to call setCharacterEncoding before reading any parameter, or it's too late. Most web browsers don't send the charset attribute in the content-type header, so servlet engines may guess it wrong.

public void doGet(HttpServletRequest req, HttpServletResponse res)
      throws ServletException, IOException { doPost(req, res); }

public void doPost(HttpServletRequest req, HttpServletResponse res)
      throws ServletException, IOException {
   // Must run before the first getParameter call
   if (req.getCharacterEncoding() == null)
      req.setCharacterEncoding("UTF-8");

   String value = req.getParameter("fieldName");
   ...
}
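
The effect of a wrong guess can be reproduced without a servlet container. The sketch below (a hypothetical demo class) percent-encodes a Greek letter as UTF-8 bytes, which is what a browser typically sends from a UTF-8 page, then decodes it once as UTF-8 and once as ISO-8859-1, the default the servlet specification prescribes when the request names no charset.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical demo class showing what a wrong charset guess does to a parameter.
class ParameterDecoding {

    // Percent-encodes the value using its UTF-8 bytes.
    static String encodeUtf8(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    // Decodes a percent-encoded value, interpreting the bytes in the given charset.
    static String decode(String encoded, String charset) {
        try {
            return URLDecoder.decode(encoded, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
    }

    public static void main(String[] args) {
        String kappa = "\u03ba";          // Greek small letter kappa
        String wire = encodeUtf8(kappa);  // "%CE%BA"
        System.out.println(wire);
        // Right charset: the original letter comes back.
        System.out.println(decode(wire, "UTF-8").equals(kappa));               // true
        // Wrong charset: two Latin-1 characters instead of one kappa.
        System.out.println(decode(wire, "ISO-8859-1").equals("\u00ce\u00ba")); // true
    }
}
```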

Be careful with .jsp pages: do not emit any leading whitespace, or it may be too late to call setCharacterEncoding. Note how I put the tag end markers at the end of each line to avoid stray whitespace, and how the HTML elements start on the very first output line. The contentType attribute goes into the HTTP response, while pageEncoding states how the file is stored on disk. If you only have an ISO-8859-15 text editor and do not hardcode i18n letters in the JSP page, you may choose a matching iso* pageEncoding.

<%@ taglib prefix="c" uri="http://java.sun.com/jsp/jstl/core" %><%@ 
    taglib prefix="x" uri="http://java.sun.com/jsp/jstl/xml"  %><%@ 
    page contentType="text/html; charset=UTF-8" pageEncoding="UTF-8"
    import="java.util.*,
             java.io.*
    "
%><%
   if (request.getCharacterEncoding() == null)
      request.setCharacterEncoding("UTF-8");
   String param1 = request.getParameter("fieldName");
%><!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
  <title>Page Title</title>
  <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
  <meta name="keywords" content="some,fine,keywords" />
</head>
<body>
your html content goes here.... <%= param1 %>
</body>
</html>

When creating an XML document in a JSP page, you need to write the XML header without leading whitespace or newlines. Note how the scriptlet end tag and the XML header are on the same line. Embedded JSP code must always take this into account: an innocent leading whitespace character may ruin an otherwise well-formed reply.

<%@ taglib prefix="c" uri="http://java.sun.com/jsp/jstl/core" %><%@ 
    page contentType="text/xml; charset=UTF-8" pageEncoding="ISO-8859-1"
    import="java.util.*, 
             java.io.*
    "
%><%
  // MyBean has getId() and getName() getters
  List<MyBean> items = new ArrayList<MyBean>();
  items.add( new MyBean(1, "first") );
  items.add( new MyBean(2, "second") );
  items.add( new MyBean(3, "third") );

  pageContext.setAttribute("items", items);
%><?xml version="1.0" encoding="UTF-8"?>
<mydoc>
<c:forEach var="item" items="${items}">
  <item>
    <id>${item.id}</id>
    <name>${item.name}</name>
  </item>
</c:forEach>
</mydoc>
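
Why the leading whitespace matters can be checked with the JDK's own XML parser: an XML declaration is only legal at the very start of the document. A small sketch, using a hypothetical helper class, that parses the same document with and without a leading newline:

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper used only to illustrate the point above.
class XmlHeaderCheck {

    // Returns true if the text parses as well-formed XML, false otherwise.
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler throws on fatal errors instead of printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (Exception e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String doc = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><mydoc><item/></mydoc>";
        System.out.println(isWellFormed(doc));        // true
        System.out.println(isWellFormed("\n" + doc)); // false
    }
}
```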



Answer 2:


This happens because the wrong encoding is used in the Java class. I also advise you to check your MySQL database encoding.

[mysqld]
character-set-server = utf8
collation-server = utf8_unicode_ci

Check these server-side database parameters:

character_set_results 
character_set_connection
character_set_client 
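
These variables can also be read from Java over the same kind of JDBC connection the application uses, which shows what the session actually ended up with. A sketch only: the class name is made up, the URL and credentials are the placeholders from the first answer, and it needs a live MySQL server to run.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative helper; not part of the original answer.
class CharsetVariables {

    static final String QUERY = "SHOW VARIABLES LIKE 'character_set%'";

    // Reads the character_set_* variables as seen by this connection.
    static Map<String, String> read(Connection con) throws SQLException {
        Map<String, String> vars = new LinkedHashMap<>();
        try (Statement st = con.createStatement();
             ResultSet rs = st.executeQuery(QUERY)) {
            while (rs.next()) {
                vars.put(rs.getString(1), rs.getString(2)); // variable name, value
            }
        }
        return vars;
    }

    public static void main(String[] args) throws SQLException {
        String url = "jdbc:mysql://localhost:3306/mydb?useUnicode=true&characterEncoding=utf8";
        try (Connection con = DriverManager.getConnection(url, "myuid", "mypwd")) {
            read(con).forEach((name, value) -> System.out.println(name + " = " + value));
        }
    }
}
```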


Source: https://stackoverflow.com/questions/33669560/non-english-characters-in-database-using-java
