Class GeneralUnicodeString
java.lang.Object
net.sf.saxon.regex.UnicodeString
net.sf.saxon.regex.GeneralUnicodeString
- All Implemented Interfaces:
CharSequence, Comparable<UnicodeString>, AtomicMatchKey
A Unicode string which, in general, may contain non-BMP characters (that is, codepoints
outside the range 0-65535).
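The difference between 16-bit chars and codepoints can be illustrated with the JDK alone. The following is a minimal sketch that does not use GeneralUnicodeString itself; the class name CodepointVsChar and its helper methods are hypothetical, chosen only to mirror the distinction between length() and uLength() on this class.

```java
public class CodepointVsChar {
    // Number of 16-bit char units, which is what CharSequence.length() counts.
    static int charLength(String s) {
        return s.length();
    }

    // Number of Unicode codepoints; a surrogate pair counts as one.
    static int codepointLength(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        // U+1F600 lies outside the BMP, so UTF-16 stores it as two chars.
        String s = "a" + new String(Character.toChars(0x1F600)) + "b";
        System.out.println(charLength(s));      // 4
        System.out.println(codepointLength(s)); // 3
    }
}
```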
-
Field Summary
Fields inherited from interface AtomicMatchKey
NaN_MATCH_KEY -
Constructor Summary
Constructors -
Method Summary
Modifier and Type, Method, Description:
char charAt(int index) - Returns the char value at the specified index.
boolean isEnd(int pos) - Ask whether a given position is at (or beyond) the end of the string
int length() - Returns the length of this character sequence.
subSequence(int start, int end) - Returns a new CharSequence that is a subsequence of this sequence.
toString()
int uCharAt(int pos) - Get the character at a specified position
int uIndexOf(int search, int pos) - Get the first match for a given character
int uLength() - Get the length of the string, in Unicode codepoints
uSubstring(int beginIndex, int endIndex) - Get a substring of this string
Methods inherited from class UnicodeString
asAtomic, compareTo, containsSurrogatePairs, equals, hashCode, makeUnicodeString, makeUnicodeString
Methods inherited from interface CharSequence
chars, codePoints, getChars, isEmpty
-
Constructor Details
-
GeneralUnicodeString
-
GeneralUnicodeString
GeneralUnicodeString(int[] chars, int start, int end)
-
-
Method Details
-
uSubstring
Description copied from class: UnicodeString
Get a substring of this string
- Specified by:
uSubstring in class UnicodeString
- Parameters:
beginIndex - the index of the first character to be included (counting codepoints, not 16-bit characters)
endIndex - the index of the first character NOT to be included (counting codepoints, not 16-bit characters)
- Returns:
a substring
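The index convention (inclusive beginIndex, exclusive endIndex, both counted in codepoints) can be sketched on a plain java.lang.String. This is an illustration using only JDK calls, not the Saxon implementation; the class CodepointSubstring and the method substringByCodepoints are hypothetical names.

```java
public class CodepointSubstring {
    // Returns the codepoints in [beginIndex, endIndex), where both
    // indexes count codepoints rather than 16-bit chars.
    static String substringByCodepoints(String s, int beginIndex, int endIndex) {
        // Translate codepoint positions into char offsets.
        int from = s.offsetByCodePoints(0, beginIndex);
        int to = s.offsetByCodePoints(from, endIndex - beginIndex);
        return s.substring(from, to);
    }

    public static void main(String[] args) {
        // Codepoints: 'a', U+1F600, 'b', 'c' (five chars, four codepoints).
        String s = "a\uD83D\uDE00bc";
        String sub = substringByCodepoints(s, 1, 3);
        // The result holds U+1F600 followed by 'b': two codepoints, three chars.
        System.out.println(sub.codePointCount(0, sub.length())); // 2
        System.out.println(sub.length());                        // 3
    }
}
```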
-
uCharAt
public int uCharAt(int pos)
Description copied from class: UnicodeString
Get the character at a specified position
- Specified by:
uCharAt in class UnicodeString
- Parameters:
pos - the index of the required character (counting codepoints, not 16-bit characters)
- Returns:
the character (Unicode codepoint) at the specified position.
-
uIndexOf
public int uIndexOf(int search, int pos)
Description copied from class: UnicodeString
Get the first match for a given character
- Specified by:
uIndexOf in class UnicodeString
- Parameters:
search - the character to look for
pos - the first position to look
- Returns:
the position of the first occurrence of the sought character, or -1 if not found
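A search of this kind can be sketched over an array of codepoints, so that both the starting position and the result count codepoints. This is a minimal illustration, not the Saxon implementation; the class CodepointIndexOf and the method indexOfCodepoint are hypothetical, and the handling of out-of-range values of pos is left unspecified because the source does not describe it.

```java
public class CodepointIndexOf {
    // Scans forward from codepoint position pos and returns the position
    // of the first element equal to search, or -1 if there is none.
    static int indexOfCodepoint(int[] codepoints, int search, int pos) {
        for (int i = pos; i < codepoints.length; i++) {
            if (codepoints[i] == search) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Codepoints: 'a', U+1F600, 'b', U+1F600.
        int[] cps = "a\uD83D\uDE00b\uD83D\uDE00".codePoints().toArray();
        System.out.println(indexOfCodepoint(cps, 0x1F600, 0)); // 1
        System.out.println(indexOfCodepoint(cps, 0x1F600, 2)); // 3
        System.out.println(indexOfCodepoint(cps, 'z', 0));     // -1
    }
}
```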
-
uLength
public int uLength()
Description copied from class: UnicodeString
Get the length of the string, in Unicode codepoints
- Specified by:
uLength in class UnicodeString
- Returns:
the number of codepoints in the string
-
isEnd
public boolean isEnd(int pos)
Description copied from class: UnicodeString
Ask whether a given position is at (or beyond) the end of the string
- Specified by:
isEnd in class UnicodeString
- Parameters:
pos - the index of the required character (counting codepoints, not 16-bit characters)
- Returns:
true if and only if the specified position is at or beyond the end of the string
-
toString
- Specified by:
toString in interface CharSequence
- Overrides:
toString in class Object
-
length
public int length()
Returns the length of this character sequence. The length is the number of 16-bit chars in the sequence.
- Returns:
the number of chars in this sequence
-
charAt
public char charAt(int index)
Returns the char value at the specified index. An index ranges from zero to length() - 1. The first char value of the sequence is at index zero, the next at index one, and so on, as for array indexing.
If the char value specified by the index is a surrogate, the surrogate value is returned.
- Parameters:
index - the index of the char value to be returned
- Returns:
the specified char value
- Throws:
IndexOutOfBoundsException - if the index argument is negative or not less than length()
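The surrogate behaviour of charAt follows the general CharSequence contract and can be observed on any CharSequence. The sketch below uses a plain String rather than GeneralUnicodeString; the class SurrogateCharAt and the method isSurrogateAt are hypothetical names used only for illustration.

```java
public class SurrogateCharAt {
    // Reports whether the 16-bit char at the given char index is one half
    // of a surrogate pair rather than a complete BMP character.
    static boolean isSurrogateAt(CharSequence s, int index) {
        return Character.isSurrogate(s.charAt(index));
    }

    public static void main(String[] args) {
        // 'a' followed by U+1F600, which UTF-16 encodes as D83D DE00.
        String s = "a\uD83D\uDE00";
        System.out.println(isSurrogateAt(s, 0)); // false
        System.out.println(isSurrogateAt(s, 1)); // true
        // charAt yields only the high surrogate; codePointAt joins the pair.
        System.out.println(Integer.toHexString(s.charAt(1)));      // d83d
        System.out.println(Integer.toHexString(s.codePointAt(1))); // 1f600
    }
}
```

Callers that need whole characters would therefore index by codepoint (uCharAt on this class) rather than by char.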
-
subSequence
Returns a new CharSequence that is a subsequence of this sequence. The subsequence starts with the char value at the specified index and ends with the char value at index end - 1. The length (in chars) of the returned sequence is end - start, so if start == end then an empty sequence is returned.
- Parameters:
start - the start index, inclusive
end - the end index, exclusive
- Returns:
the specified subsequence
- Throws:
IndexOutOfBoundsException - if start or end are negative, if end is greater than length(), or if start is greater than end
-