Encoding in JSP/Servlet with UTF-8

Unfortunately, UTF-8 is not the default encoding used by application servers, browsers, editors and so on; otherwise this article would be unnecessary. For a web application to send and receive UTF-8 correctly, the encoding has to be configured at several different points.

Managing the encoding in GET requests

URL encoding must not be confused with character encoding. URL encoding converts characters to their numeric representation in the %xx format, so that special characters can be passed through a URL without problems. The client URL-encodes the characters before sending them to the server, and the server must URL-decode them using the same character encoding (see also percent-encoding). Some application servers use the ISO 8859-1 character encoding to URL-decode the request parameters, so you need to force the character encoding to UTF-8 yourself.
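
To make the difference concrete, here is a minimal sketch showing that the same character produces different %xx sequences depending on the character encoding used underneath. The class and method names are only illustrative, and the Charset overload of URLEncoder.encode assumes Java 10 or later.

```java
import java.net.URLEncoder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class PercentEncodingDemo {

    // Percent-encodes a value using the bytes produced by the given charset.
    static String encode(String value, Charset charset) {
        return URLEncoder.encode(value, charset);
    }

    public static void main(String[] args) {
        // "é", written as an escape so the source file encoding does not matter
        String input = "\u00e9";

        // Two bytes in UTF-8, so two %xx groups
        System.out.println("UTF-8:      " + encode(input, StandardCharsets.UTF_8));      // %C3%A9
        // One byte in ISO 8859-1, so a single %xx group
        System.out.println("ISO-8859-1: " + encode(input, StandardCharsets.ISO_8859_1)); // %E9
    }
}
```

A server that receives %C3%A9 and decodes it as ISO 8859-1 sees two separate characters instead of one, which is why both sides have to agree on the encoding.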

How to configure this depends on the server used, so refer to its documentation. Some examples:

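Tomcat: set the URIEncoding attribute of the <Connector> element in Tomcat's /conf/server.xml to define the character encoding of HTTP GET requests. If it is not specified, ISO-8859-1 is used in Tomcat 7 and earlier (Tomcat 8 and later default to UTF-8 unless strict servlet compliance is enabled); see the Tomcat documentation: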

<Connector (...) URIEncoding="UTF-8" />

GlassFish: set the <parameter-encoding> entry in the webapp's /WEB-INF/sun-web.xml (or, since GlassFish 3.1, glassfish-web.xml); see also the GlassFish documentation:

<parameter-encoding default-charset="UTF-8" />

JBoss 7.1.0 or higher: set two system properties in standalone.xml:

<?xml version='1.0' encoding='UTF-8'?>
<server xmlns="urn:jboss:domain:1.1">
    <extensions>
        <extension module="org.jboss.as.clustering.infinispan"/>
        .................
        .................
    </extensions>

    <system-properties>
        <property name="org.apache.catalina.connector.URI_ENCODING" value="UTF-8"/>
        <property name="org.apache.catalina.connector.USE_BODY_ENCODING_FOR_QUERY_STRING" value="true"/>
    </system-properties>

WebSphere 7.x or higher: set a system property directly on the Java virtual machine, using the following procedure:

1. On the Application Server page, click on the name of the server you want enabled for UTF-8.
2. On the settings page for the selected application server, click Process Definition.
3. On the Process Definition page, click Java Virtual Machine.
4. On the Java Virtual Machine page, specify “-Dclient.encoding.override=UTF-8” for Generic JVM Arguments and click OK.
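
Both the JBoss <system-properties> entries and the WebSphere generic JVM argument end up as JVM system properties, so one way to check that a setting actually reached the running server is to read it back. The following is only a sketch; the class and method names are hypothetical, and the code would have to run inside the server JVM (for example from a test servlet) to be meaningful.

```java
final class JvmPropertyCheck {

    // Returns a printable line with the property's value, or a marker if it is absent.
    static String describe(String name) {
        String value = System.getProperty(name);
        return name + " = " + (value != null ? value : "(not set)");
    }

    public static void main(String[] args) {
        // Property names taken from the configuration examples above
        System.out.println(describe("client.encoding.override"));
        System.out.println(describe("org.apache.catalina.connector.URI_ENCODING"));
        System.out.println(describe("org.apache.catalina.connector.USE_BODY_ENCODING_FOR_QUERY_STRING"));
    }
}
```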

Here is a snippet that demonstrates what happens when the server URL-decodes with the wrong character encoding:

package test;

import java.net.URLDecoder;
import java.net.URLEncoder;

public class Test {

    public static void main(String... args) throws Exception {
        String input = "عربي";
        System.out.println("Original input string from client: " + input);

        String encoded = URLEncoder.encode(input, "UTF-8");
        System.out.println("URL-encoded by client with UTF-8: " + encoded);

        String incorrectDecoded = URLDecoder.decode(encoded, "ISO-8859-1");
        System.out.println("URL-decoded with ISO-8859-1: " + incorrectDecoded);

        String correctDecoded = URLDecoder.decode(encoded, "UTF-8");
        System.out.println("URL-decoded with UTF-8: " + correctDecoded);
    }
}
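
If the server configuration cannot be changed, a commonly used workaround is to undo the wrong decoding in application code. The sketch below assumes the server decoded with ISO 8859-1, which maps every byte to exactly one character, so the original bytes can be recovered and decoded again as UTF-8. The class and method names are illustrative, and the Charset overloads of URLEncoder and URLDecoder assume Java 10 or later. Fixing the server setting remains the cleaner option.

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

final class MisdecodedParameterRepair {

    // Turns the garbled text back into the original bytes (lossless for
    // ISO 8859-1) and decodes those bytes as UTF-8.
    static String repair(String misdecoded) {
        byte[] originalBytes = misdecoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(originalBytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Same Arabic sample text as in the snippet above, written as escapes
        String input = "\u0639\u0631\u0628\u064a";

        // Client encodes with UTF-8, server wrongly decodes with ISO 8859-1
        String encoded = URLEncoder.encode(input, StandardCharsets.UTF_8);
        String misdecoded = URLDecoder.decode(encoded, StandardCharsets.ISO_8859_1);

        System.out.println("Repaired equals original: " + repair(misdecoded).equals(input)); // true
    }
}
```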

Managing the encoding in POST requests

The browser should send the charset it used in the Content-Type request header. However, most web browsers do not. They simply use the same character encoding as the one the page containing the form was delivered with.

This problem can be solved by setting that same character encoding on the ServletRequest object yourself. An easy solution is to implement a Filter that is mapped to a url-pattern of /* and basically contains only the following lines in its doFilter() method:

if (request.getCharacterEncoding() == null) {
    request.setCharacterEncoding("UTF-8");
}
chain.doFilter(request, response);
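
The null check matters because getCharacterEncoding() returns null when the request does not declare an encoding, which is the usual case for browser form posts. The following standalone sketch mimics that decision without the Servlet API. It is not the container's actual code, and the class and method names are hypothetical.

```java
import java.util.Locale;

final class RequestEncodingSketch {

    // Returns the charset parameter of a Content-Type header value, or null
    // when the header is missing or carries no charset parameter.
    static String charsetFromContentType(String contentType) {
        if (contentType == null) {
            return null;
        }
        for (String part : contentType.split(";")) {
            String trimmed = part.trim();
            if (trimmed.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String value = trimmed.substring("charset=".length()).trim();
                return value.isEmpty() ? null : value;
            }
        }
        return null;
    }

    // Same decision as the filter: keep a declared charset, otherwise fall back to UTF-8.
    static String effectiveEncoding(String contentType) {
        String declared = charsetFromContentType(contentType);
        return declared != null ? declared : "UTF-8";
    }

    public static void main(String[] args) {
        // Typical browser form post: no charset declared, so the fallback applies
        System.out.println(effectiveEncoding("application/x-www-form-urlencoded")); // UTF-8
        // A client that does declare a charset keeps it
        System.out.println(effectiveEncoding("application/x-www-form-urlencoded; charset=ISO-8859-1")); // ISO-8859-1
    }
}
```

Respecting a declared charset instead of always overriding it keeps the filter safe for clients that do send the header correctly.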

JSP/Servlet response

When processing the response, an average application server will by default use ISO 8859-1 to encode the response output stream, but it is possible to force the response encoding to UTF-8. In a JSP, adding the following line at the top of the .jsp file is sufficient:

<%@ page pageEncoding="UTF-8" %>

This sets the response output stream encoding to UTF-8 and sets the HTTP response Content-Type header to text/html;charset=UTF-8. It is also possible to apply this setting globally, so that you don't need to edit every individual JSP. Just add the following entry to your /WEB-INF/web.xml file:

<jsp-config>
    <jsp-property-group>
        <url-pattern>*.jsp</url-pattern>
        <page-encoding>UTF-8</page-encoding>
    </jsp-property-group>
</jsp-config>

If you use an HttpServlet instead of a JSP to generate HTML content with out.write(), out.print() statements and so on, set the encoding on the ServletResponse object itself inside the servlet method, before you call getWriter() or getOutputStream() on it:

response.setCharacterEncoding("UTF-8");
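
The order matters because a writer is bound to its character encoding at the moment it is created. The sketch below illustrates this with plain java.io classes standing in for the response writer; it does not use the Servlet API, and the class and method names are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class ResponseEncodingSketch {

    // Writes text through a writer bound to the given charset, much as a
    // response writer is bound to the response encoding, and returns the bytes.
    static byte[] write(String text, Charset charset) {
        ByteArrayOutputStream body = new ByteArrayOutputStream();
        PrintWriter out = new PrintWriter(new OutputStreamWriter(body, charset));
        out.print(text);
        out.flush();
        return body.toByteArray();
    }

    public static void main(String[] args) {
        String text = "\u0639\u0631\u0628\u064a"; // four Arabic letters

        // ISO 8859-1 cannot represent these letters, so each one becomes '?'
        byte[] latin1 = write(text, StandardCharsets.ISO_8859_1);
        System.out.println(new String(latin1, StandardCharsets.ISO_8859_1)); // ????

        // UTF-8 uses two bytes per letter here and round-trips cleanly
        byte[] utf8 = write(text, StandardCharsets.UTF_8);
        System.out.println(utf8.length);                                           // 8
        System.out.println(new String(utf8, StandardCharsets.UTF_8).equals(text)); // true
    }
}
```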

HTTP content-type header

The HTTP Content-Type header should tell the browser on the client side which character encoding to use for display. The browser must give it precedence over any HTML meta content-type tag, as specified in section 5.2.2 of the W3C HTML 4.01 specification.

Browsers must observe the following priorities when determining a document's character encoding (from highest priority to lowest):

  • An HTTP “charset” parameter in a “Content-Type” field.
  • A META declaration with “http-equiv” set to “Content-Type” and a value set for “charset”.
  • The charset attribute set on an element that designates an external resource.
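
The priority list above amounts to "first declared source wins". As a rough sketch (not how any particular browser is implemented, and with hypothetical names), it can be written as:

```java
final class CharsetPrioritySketch {

    // Returns the first charset that is declared, in priority order:
    // HTTP Content-Type header, then META declaration, then the charset
    // attribute on the referring element. Returns null if none is declared.
    static String resolve(String httpCharset, String metaCharset, String linkCharset) {
        if (httpCharset != null) {
            return httpCharset;
        }
        if (metaCharset != null) {
            return metaCharset;
        }
        return linkCharset;
    }

    public static void main(String[] args) {
        // The HTTP header wins even when the META declaration disagrees
        System.out.println(resolve("UTF-8", "ISO-8859-1", null)); // UTF-8
        // Without an HTTP charset, the META declaration is used
        System.out.println(resolve(null, "ISO-8859-1", "UTF-16")); // ISO-8859-1
    }
}
```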

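Although the HTTP header takes precedence, it is still worth adding the following HTML meta content-type tag to your JSP, so that the encoding is also declared when the page is viewed without HTTP headers, for example from a locally saved file: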

<meta http-equiv="content-type" content="text/html; charset=utf-8">