Exact difference between CharSequence and String in java [duplicate]

感情迁移 提交于 2019-11-26 19:22:23

General differences

There are several classes which implement the CharSequence interface besides String. Among these are

  • StringBuilder for variable-length character sequences which can be modified
  • CharBuffer for fixed-length low-level character sequences which can be modified

Any method which accepts a CharSequence can operate on all of these equally well. Any method which only accepts a String will require conversion. So using CharSequence as an argument type in all the places where you don't care about the internals is prudent. However you should use String as a return type if you actually return a String, because that avoids possible conversions of returned values if the calling method actually does require a String.

Also note that maps should use String as key type, not CharSequence, as map keys must not change. In other words, sometimes the immutable nature of String is essential.

Specific code snippet

As for the code you pasted: simply compile that, and have a look at the JVM bytecode using javap -v. There you will notice that both obj and str are references to the same constant object. As a String is immutable, this kind of sharing is all right.

The + operator of String is compiled as invocations of various StringBuilder.append calls. So it is equivalent to

System.out.println(
  (new StringBuilder())
  .append("output is : ")
  .append((Object)obj)
  .append(" ")
  .append(str)
  .toString()
)

I must confess I'm a bit surprised that my compiler javac 1.6.0_33 compiles the + obj using StringBuilder.append(Object) instead of StringBuilder.append(CharSequence). The former probably involves a call to the toString() method of the object, whereas the latter should be possible in a more efficient way. On the other hand, String.toString() simply returns the String itself, so there is little penalty there. So StringBuilder.append(String) might be more efficient by about one method invocation.

tl;dr

One is an interface (CharSequence) while other is a concrete implementation of that interface (String).

CharSequence animal = "cat"  // `String` object presented as the interface `CharSequence`.

As an interface, normally the CharSequence would be more commonly seen than String, but some twisted history resulted in the interface being defined years after the implementation. So in older APIs we often see String while in newer APIs we tend to see CharSequence used to define arguments and return types.

Details

Nowadays we know that generally an API/framework should focus on exporting interfaces primarily and concrete classes secondarily. But we did not always know this lesson so well.

The String class came first in Java. Only later did they place a front-facing interface, CharSequence.

Twisted History

A little history might help with understanding.

In its early days, Java was rushed to market a bit ahead of its time, due to the Internet/Web mania animating the industry. Some libraries were not as well thought-through as they should have been. String handling was one of those areas.

Also, Java was one of the earliest production-oriented non-academic Object-Oriented Programming (OOP) environments. The only successful real-world rubber-meets-the-road implementations of OOP before that was some limited versions of SmallTalk, then Objective-C with NeXTSTEP/OpenStep. So, many practical lessons were yet to be learned.

Java started with the String class and StringBuffer class. But those two classes were unrelated, not tied to each other by inheritance nor interface. Later, the Java team recognized that there should have been a unifying tie between string-related implementations to make them interchangeable. In Java 4 the team added the CharSequence interface and retroactively implemented that interface on String and String Buffer, as well as adding another implementation CharBuffer. Later in Java 5 they added StringBuilder, basically a unsynchronized and therefore somewhat faster version of StringBuffer.

So these string-oriented classes are a bit of a mess, and a little confusing to learn about. Many libraries and interfaces were built to take and return String objects. Nowadays such libraries should generally be built to expect CharSequence. But (a) String seems to still dominate the mindspace, and (b) there may be some subtle technical issues when mixing the various CharSequence implementations. With the 20/20 vision of hindsight we can see that all this string stuff could have been better handled, but here we are.

Ideally Java would have started with an interface and/or superclass that would be used in many places where we now use String, just as we use the Collection or List interfaces in place of the ArrayList or LinkedList implementations.

Interface Versus Class

The key difference about CharSequence is that it is an interface, not an implementation. That means you cannot directly instantiate a CharSequence. Rather you instantiate one of the classes that implements that interface.

For example, here we have x that looks like a CharSequence but underneath is actually a StringBuilder object.

CharSequence x = new StringBuilder( "dog" );

This becomes less obvious when using a String literal. Keep in mind that when you see source code with just quote marks around characters, the compiler is translating that into a String object.

CharSequence y = "cat";  // Looks like a CharSequence but is actually a String instance.

There are some subtle differences between "cat" and new String("cat") as discussed in this other Question, but are irrelevant here.

Class Diagram

This class diagram may help to guide you. I noted the version of Java in which they appeared to demonstrate how much change has churned through these classes and interfaces.

Text Blocks

Other than more and more emoji and other characters that have come with successive versions of Unicode support, in recent years not much has changed in Java for working with text… until Java 13.

Java 13 may offer a preview of the new feature: text blocks. This would make writing embedded code strings such as SQL more convenient. See JEP 355.

This effort was preceded by JEP 326: Raw String Literals (Preview).

Francisco Spaeth

CharSequence is a contract (interface), and String is an implementation of this contract.

public final class String extends Object 
    implements Serializable, Comparable<String>, CharSequence

The documentation for CharSequence is:

A CharSequence is a readable sequence of char values. This interface provides uniform, read-only access to many different kinds of char sequences. A char value represents a character in the Basic Multilingual Plane (BMP) or a surrogate. Refer to Unicode Character Representation for details.

assylias

other than the fact that String implements CharSequence and that String is a sequence of character.

Several things happen in your code:

CharSequence obj = "hello";

That creates a String literal, "hello", which is a String object. Being a String, which implements CharSequence, it is also a CharSequence. (you can read this post about coding to interface for example).

The next line:

String str = "hello";

is a little more complex. String literals in Java are held in a pool (interned) so the "hello" on this line is the same object (identity) as the "hello" on the first line. Therefore, this line only assigns the same String literal to str.

At this point, both obj and str are references to the String literal "hello" and are therefore equals, == and they are both a String and a CharSequence.

I suggest you test this code, showing in action what I just wrote:

public static void main(String[] args) {
    CharSequence obj = "hello";
    String str = "hello";
    System.out.println("Type of obj: " + obj.getClass().getSimpleName());
    System.out.println("Type of str: " + str.getClass().getSimpleName());
    System.out.println("Value of obj: " + obj);
    System.out.println("Value of str: " + str);
    System.out.println("Is obj a String? " + (obj instanceof String));
    System.out.println("Is obj a CharSequence? " + (obj instanceof CharSequence));
    System.out.println("Is str a String? " + (str instanceof String));
    System.out.println("Is str a CharSequence? " + (str instanceof CharSequence));
    System.out.println("Is \"hello\" a String? " + ("hello" instanceof String));
    System.out.println("Is \"hello\" a CharSequence? " + ("hello" instanceof CharSequence));
    System.out.println("str.equals(obj)? " + str.equals(obj));
    System.out.println("(str == obj)? " + (str == obj));
}

I know it a kind of obvious, but CharSequence is an interface whereas String is a concrete class :)

java.lang.String is an implementation of this interface...

Consider UTF-8. In UTF-8 Unicode code points are built from one or more bytes. A class encapsulating a UTF-8 byte array can implement the CharSequence interface but is most decidedly not a String. Certainly you can't pass a UTF-8 byte array where a String is expected but you certainly can pass a UTF-8 wrapper class that implements CharSequence when the contract is relaxed to allow a CharSequence. On my project, I am developing a class called CBTF8Field (Compressed Binary Transfer Format - Eight Bit) to provide data compression for xml and am looking to use the CharSequence interface to implement conversions from CBTF8 byte arrays to/from character arrays (UTF-16) and byte arrays (UTF-8).

The reason I came here was to get a complete understanding of the subsequence contract.

From the Java API of CharSequence:

A CharSequence is a readable sequence of characters. This interface provides uniform, read-only access to many different kinds of character sequences.

This interface is then used by String, CharBuffer and StringBuffer to keep consistency for all method names.

In charSequence you don't have very useful methods which are available for String. If you don't want to look in the documentation, type: obj. and str.

and see what methods your compilator offers you. That's the basic difference for me.

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!