java.lang.Object
  org.jruby.util.string.Ustr
public class Ustr
Ustr - rhymes with Wooster. Implements a string, with three design goals:
A Ustr is a fairly thin wrapper around a byte[] array, which contains null-terminated UTF8-encoded text.
Note that in the context of a Ustr, "index" always means how many Unicode characters you are into the Ustr's text, while "offset" always means how many bytes you are into its UTF8 encoded form.
Similarly, "char" and "String" always refer to the Java constructs, while "character" always means a Unicode character, always identified by a Java int.
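To make the index/offset distinction concrete, here is a minimal sketch in plain Java that does not use the Ustr class itself. The class name IndexOffsetSketch and both helpers are hypothetical, not part of the Ustr API, and the sketch assumes the buffer holds well-formed, null-terminated UTF-8.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class IndexOffsetSketch {
    /** Hypothetical helper: UTF-8 bytes of str followed by a single 0 terminator. */
    static byte[] nullTerminatedUtf8(String str) {
        byte[] raw = str.getBytes(StandardCharsets.UTF_8);
        return Arrays.copyOf(raw, raw.length + 1); // the extra byte is already 0
    }

    /**
     * Returns the byte offset at which the character with the given index
     * starts, or -1 if the text has fewer characters than that.
     */
    static int offsetOfIndex(byte[] s, int index) {
        int offset = 0;
        int seen = 0; // number of characters passed so far
        while (offset < s.length && s[offset] != 0) {
            if (seen == index) {
                return offset;
            }
            offset++; // step over the lead byte
            // step over continuation bytes, which look like 10xxxxxx
            while (offset < s.length && (s[offset] & 0xC0) == 0x80) {
                offset++;
            }
            seen++;
        }
        return -1;
    }

    public static void main(String[] args) {
        // 'a' is 1 byte, U+00E9 is 2 bytes, U+20AC is 3 bytes, 'z' is 1 byte
        byte[] s = nullTerminatedUtf8("a\u00e9\u20acz");
        for (int index = 0; index < 4; index++) {
            System.out.println("index " + index + " -> offset " + offsetOfIndex(s, index));
        }
    }
}
```

Indices and offsets agree only while the text is pure ASCII; every multi-byte character pushes later offsets ahead of their indices.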
If any of the Ustr methods are passed an integer alleged to represent a Unicode character whose value is not a valid code point, i.e. is either negative or greater than 0x10ffff, the method will throw a UstrException, which extends RuntimeException and is thus not checked at compile-time.
For any method that copies characters and might overrun a buffer, a "safe" version is provided, starting with an extra s, e.g. sstrcpy and sstrcat. These versions always arrange that the copied string does not overrun the provided buffer, which will be properly null-terminated.
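The idea behind a "safe" copy can be sketched as follows. This is not the Ustr implementation: the class SafeCopySketch and its methods are hypothetical, and the choice to truncate on a character boundary (rather than mid-character) is an assumption, since the text above only promises that the buffer is not overrun and stays null-terminated.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class SafeCopySketch {
    /** Copies null-terminated UTF-8 from 'from' into 'to', truncating if needed. */
    static byte[] safeCopy(byte[] to, byte[] from) {
        if (to.length == 0) {
            return to; // no room even for a terminator
        }
        int limit = to.length - 1; // reserve one byte for the terminator
        int i = 0;
        while (i < limit && i < from.length && from[i] != 0) {
            to[i] = from[i];
            i++;
        }
        boolean truncated = i < from.length && from[i] != 0;
        if (truncated) {
            // Assumption: if the cut fell inside a multi-byte character,
            // back up to that character's lead byte and drop it entirely.
            while (i > 0 && (from[i] & 0xC0) == 0x80) {
                i--;
            }
        }
        to[i] = 0;
        return to;
    }

    /** Number of bytes before the first 0 byte. */
    static int byteLength(byte[] b) {
        int n = 0;
        while (n < b.length && b[n] != 0) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        byte[] raw = "h\u00e9llo".getBytes(StandardCharsets.UTF_8);
        byte[] from = Arrays.copyOf(raw, raw.length + 1);
        byte[] small = safeCopy(new byte[3], from);
        System.out.println("bytes kept in a 3-byte buffer: " + byteLength(small));
    }
}
```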
See Also: UstrException, Serialized Form
Field Summary | |
---|---|
int |
base
Where in the array s the string starts. |
int |
offset
To keep track of a single character position within the string; this is used by the nextChar and appendChar methods. |
byte[] |
s
A null-terminated byte array containing the string in UTF-8 form. |
Constructor Summary | |
---|---|
Ustr()
Creates an empty Ustr with no buffer |
|
Ustr(byte[] bytes)
Wraps a Ustr around a buffer. |
|
Ustr(byte[] bytes,
int start)
Wraps a Ustr around a position in a buffer. |
|
Ustr(char[] chars)
Makes a Ustr from a char[] array. |
|
Ustr(int length)
Creates an empty Ustr, with a null termination at the front. |
|
Ustr(int[] ints)
Makes a Ustr from an int[] array, where each int is the value of a Unicode character. |
|
Ustr(int space,
java.lang.Object o)
Makes a Ustr from an object, based on its toString(), leaving room for growth. |
|
Ustr(java.lang.Object o)
Makes a Ustr from an object, based on its toString(). |
|
Ustr(Ustr from)
Makes a Ustr which is a copy of another Ustr |
Method Summary | |
---|---|
void |
appendChar(int c)
Append one Unicode character to a Ustr. |
static int |
appendChar(int c,
byte[] s,
int offset)
Writes one Unicode character into a UTF-8 encoded byte array at a given offset, and null-terminates it. |
int |
charAt(int at)
Find the Unicode character at some index in a Ustr. |
int |
compareTo(java.lang.Object other)
Supports the Comparable interface. |
Ustr |
concat(java.lang.String str)
Append a String to the end of this. |
Ustr |
concat(Ustr us)
Append a Ustr to the end of this. |
boolean |
endsWith(java.lang.String suffix)
Test if this Ustr ends with specified suffix (a String). |
boolean |
endsWith(Ustr suffix)
Test if this Ustr ends with the specified suffix (a Ustr). |
boolean |
equals(java.lang.Object anObject)
Compares this Ustr to another object. |
byte[] |
getBytes()
Convert this Ustr into bytes according to the platform's default character encoding, storing the result in a new byte array. |
byte[] |
getBytes(java.lang.String enc)
Convert this Ustr into bytes according to the specified character encoding, storing the result into a new byte array. |
void |
getChars(int srcBegin,
int srcEnd,
char[] dst,
int dstBegin)
Copies Unicode characters from this Ustr into the destination char array. |
static void |
getChars(java.lang.String str,
int srcBegin,
int srcEnd,
char[] dst,
int dstBegin)
Copies Unicode characters from a String into the destination char array. |
int |
hashCode()
Returns a hashcode for this Ustr. |
int |
indexOf(int ch)
Returns the first index within this Ustr of the specified Unicode character. |
int |
indexOf(int ch,
int start)
Returns the first index within this Ustr of the specified character, starting at the specified index. |
int |
indexOf(Ustr us)
Returns the index within this Ustr of the first occurrence of the specified other Ustr, or -1. |
int |
indexOf(Ustr us,
int start)
Returns the index within this Ustr of the first occurrence of the specified other Ustr starting at the given index, or -1. |
void |
init()
Empty a Ustr by setting its first byte to 0. |
Ustr |
intern()
Returns a canonical version of the Ustr, which should be treated as read-only. |
int |
lastIndexOf(int ch)
Returns the index within this Ustr of the last occurrence of the specified Unicode character. |
int |
lastIndexOf(int ch,
int stop)
Returns the index within this Ustr of the last occurrence of the specified Unicode character before the specified stop index. |
int |
lastIndexOf(Ustr us)
Finds the last substring match. |
int |
lastIndexOf(Ustr us,
int stop)
Finds the last substring match before the given index. |
int |
length()
Length of a Ustr in Unicode characters (not bytes). |
static int |
length(byte[] b)
Number of Unicode characters stored in a byte array. |
static int |
length(byte[] b,
int offset)
Number of Unicode characters stored starting at some offset in a byte array. |
static int |
length(java.lang.String str)
Number of Unicode characters stored in a Java string. |
int |
nextChar()
Retrieve one Unicode character from a Ustr and advance the working offset. |
void |
prepareAppend()
Set up for appendChar. |
void |
prepareNext()
Set up for nextChar(). |
Ustr |
replace(int oldChar,
int newChar)
Returns a new Ustr with all instances of one Unicode character replaced by another. |
byte[] |
sstrcat(byte[] to,
byte[] from)
Safely append one null-terminated byte array to another. |
static byte[] |
sstrcat(byte[] to,
int tbase,
byte[] from,
int fbase)
Safely append one null-terminated byte array to another with control over offsets. |
Ustr |
sstrcat(Ustr from)
Safely append one Ustr to another. |
static byte[] |
sstrcpy(byte[] to,
byte[] from)
Safely copy a null-terminated byte array. |
static byte[] |
sstrcpy(byte[] to,
int tbase,
byte[] from,
int fbase)
Safely copy null-terminated byte arrays with control over offsets. |
Ustr |
sstrcpy(Ustr from)
Safely copy in the contents of another Ustr. |
boolean |
startsWith(Ustr us)
Tests if other Ustr is prefix of this. |
boolean |
startsWith(Ustr us,
int start)
Tests if other Ustr is prefix at given index. |
static byte[] |
strcat(byte[] to,
byte[] from)
Copy one null-terminated byte array to the end of another. |
static byte[] |
strcat(byte[] to,
int tbase,
byte[] from,
int fbase)
Copy one null-terminated array to the end of another, with starting offsets for each |
Ustr |
strcat(Ustr other)
Append the contents of another Ustr to the end of this one |
static int |
strchr(byte[] b,
int c)
Find the offset where a Unicode character starts in a null-terminated UTF-encoded byte array. |
Ustr |
strchr(int c)
Locate a Unicode character in a Ustr. |
static int |
strcmp(byte[] s1,
byte[] s2)
Compare two null-terminated byte arrays. |
static int |
strcmp(byte[] s1,
int s1base,
byte[] s2,
int s2base)
Compare sections of two null-terminated byte arrays. |
int |
strcmp(java.lang.Object other)
Compare a Ustr to an object's String representation. |
int |
strcmp(Ustr other)
Compare two Ustrs. |
Ustr |
strcpy(byte[] from)
Copy in the contents of a null-terminated byte array. |
static byte[] |
strcpy(byte[] to,
byte[] from)
Copy a null-terminated byte array. |
Ustr |
strcpy(byte[] from,
int boffset)
Copy in the contents at some offset in a null-terminated byte array. |
static byte[] |
strcpy(byte[] to,
int tbase,
byte[] from,
int fbase)
Copy null-terminated byte arrays with control over offsets. |
static byte[] |
strcpy(byte[] b,
int offset,
java.lang.String s)
Load a null-terminated UTF-8 encoding of a String into a byte array. |
static byte[] |
strcpy(byte[] b,
java.lang.String s)
Load a null-terminated UTF-8 encoding of a String into a byte array at the front. |
Ustr |
strcpy(java.lang.Object o)
Copy in the String representation of an Object. |
Ustr |
strcpy(Ustr from)
Copy in the contents of another Ustr. |
int |
strlen()
The length in bytes of a Ustr's UTF representation. |
static int |
strlen(byte[] b)
The length in bytes of a null-terminated byte array |
static int |
strlen(byte[] b,
int base)
The length in bytes of a null-terminated sequence starting at some offset in a byte array. |
static int |
strrchr(byte[] b,
int c)
Find the index of the last appearance of a Unicode character in a null-terminated UTF-encoded byte array. |
Ustr |
strrchr(int c)
Locate the last occurrence of a Unicode character in a Ustr. |
static int |
strstr(byte[] big,
byte[] little)
Locate a substring in a byte array. |
Ustr |
strstr(Ustr little)
Locate a substring in a string. |
Ustr |
substring(int start)
Makes a new substring of a Ustr given a start index. |
Ustr |
substring(int start,
int end)
Makes a new substring of a Ustr identified by start and end indices. |
char[] |
toCharArray()
Converts a Ustr to a char array. |
java.lang.String |
toString()
Generates a Java String representing the Ustr. |
Methods inherited from class java.lang.Object |
---|
clone, finalize, getClass, notify, notifyAll, wait, wait, wait |
Field Detail |
---|
public byte[] s
A null-terminated byte array containing the string in UTF-8 form.
public int base
Where in the array s the string starts. You can have lots of different Ustrs co-existing in a single byte array.
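The following sketch illustrates how several null-terminated strings can share one byte array, each identified only by its starting position. It uses plain Java rather than Ustr; SharedBufferSketch, store and textAt are hypothetical names introduced for illustration.

```java
import java.nio.charset.StandardCharsets;

final class SharedBufferSketch {
    /**
     * Writes the UTF-8 form of str plus a 0 terminator at 'base' and
     * returns the offset just past the terminator (the next free base).
     */
    static int store(byte[] s, int base, String str) {
        byte[] raw = str.getBytes(StandardCharsets.UTF_8);
        System.arraycopy(raw, 0, s, base, raw.length);
        s[base + raw.length] = 0;
        return base + raw.length + 1;
    }

    /** Decodes the null-terminated UTF-8 text that starts at 'base'. */
    static String textAt(byte[] s, int base) {
        int end = base;
        while (end < s.length && s[end] != 0) {
            end++;
        }
        return new String(s, base, end - base, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] buffer = new byte[32];
        int firstBase = 0;
        int secondBase = store(buffer, firstBase, "caf\u00e9"); // 5 bytes + terminator
        store(buffer, secondBase, "tea");
        System.out.println(textAt(buffer, firstBase));
        System.out.println(textAt(buffer, secondBase));
    }
}
```

Each string ends at its own terminator, so a reader that starts at one base never runs into the text stored after it.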
public int offset
To keep track of a single character position within the string; this is used by the nextChar and appendChar methods.
Constructor Detail |
---|
public Ustr()
public Ustr(int length)
length - length of the buffer, in bytes
public Ustr(byte[] bytes)
bytes - the buffer
public Ustr(byte[] bytes, int start)
bytes - the buffer
start - where in the buffer the string starts
public Ustr(Ustr from)
from - the Ustr to copy
public Ustr(char[] chars)
chars - the char array
public Ustr(int[] ints)
ints - the int array
Throws: UstrException
public Ustr(java.lang.Object o)
Makes a Ustr from an object, based on its toString(). Most commonly used with a String argument. The Ustr is null-terminated, but no space is allocated beyond what's needed. Throws a UstrException if the environment doesn't support the UTF8 encoding.
o - the Object
Throws: UstrException
public Ustr(int space, java.lang.Object o)
Makes a Ustr from an object, based on its toString(), leaving room for growth. Most commonly used with a String argument. The Ustr is null-terminated.
space - how large a buffer to allocate
o - the object
Method Detail |
---|
public void init()
public int compareTo(java.lang.Object other)
Supports the Comparable interface. The ordering is that of native Unicode code points and probably not culturally appropriate anywhere.
Specified by: compareTo in interface java.lang.Comparable
other - the object compared
public java.lang.String toString()
Overrides: toString in class java.lang.Object
Throws: UstrException
public int length()
public static int length(byte[] b, int offset)
b - the byte array
offset - where to start counting
public static int length(byte[] b)
b - the byte array
public static int length(java.lang.String str)
Number of Unicode characters stored in a Java string. If s is a String, s.length() and Ustr.length(s) will be the same except when s contains non-BMP characters.
str - the string
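The difference between counting Java chars and counting Unicode characters can be seen with the standard String API alone. The sketch below does not call Ustr; StringLengthSketch and characterCount are hypothetical names, and String.codePointCount is used as a stand-in for the character-counting behavior described above.

```java
final class StringLengthSketch {
    /** Counts Unicode characters (code points), not UTF-16 chars. */
    static int characterCount(String str) {
        return str.codePointCount(0, str.length());
    }

    public static void main(String[] args) {
        String bmpOnly = "abc";
        // U+1F600 lies outside the BMP, so Java stores it as a surrogate pair
        String withNonBmp = "a\uD83D\uDE00";
        System.out.println(bmpOnly.length() + " chars, " + characterCount(bmpOnly) + " characters");
        System.out.println(withNonBmp.length() + " chars, " + characterCount(withNonBmp) + " characters");
    }
}
```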
public void prepareAppend()
Set up for appendChar. Points the offset field at the buffer's null terminator.
public void appendChar(int c)
Append one Unicode character to a Ustr. Assumes that offset points to the null-termination, where the character ought to go, updates that field and applies another null termination. You could change the value of offset and start "appending" into the middle of a Ustr if that's what you wanted. This generates the UTF-8 bytes from the input characters.
If the character is less than 128, one byte of buffer is used. If less than 0x800, two bytes. If less than 2**16, three bytes. If no greater than 0x10ffff, four bytes. If greater than 0x10ffff, or negative, you get an exception.
c - the character to be appended.
public static int appendChar(int c, byte[] s, int offset)
c - the Unicode character
s - the array
offset - the offset to write at
Throws: UstrException
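As an illustration of writing one character as UTF-8 and re-terminating the buffer, here is a minimal sketch using the standard UTF-8 length boundaries (0x80, 0x800, 0x10000). It is not the Ustr source: Utf8EncodeSketch and encodeAt are hypothetical, the return value (the offset of the new terminator) is an assumption, and IllegalArgumentException stands in for UstrException. Surrogate values are not treated specially here.

```java
final class Utf8EncodeSketch {
    /**
     * Writes code point c as UTF-8 at 'offset', adds a 0 terminator after it,
     * and returns the offset of that terminator (an assumed convention).
     */
    static int encodeAt(int c, byte[] s, int offset) {
        if (c < 0 || c > 0x10FFFF) {
            throw new IllegalArgumentException("not a Unicode code point: " + c);
        }
        if (c < 0x80) {                       // 1 byte: 0xxxxxxx
            s[offset++] = (byte) c;
        } else if (c < 0x800) {               // 2 bytes: 110xxxxx 10xxxxxx
            s[offset++] = (byte) (0xC0 | (c >> 6));
            s[offset++] = (byte) (0x80 | (c & 0x3F));
        } else if (c < 0x10000) {             // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
            s[offset++] = (byte) (0xE0 | (c >> 12));
            s[offset++] = (byte) (0x80 | ((c >> 6) & 0x3F));
            s[offset++] = (byte) (0x80 | (c & 0x3F));
        } else {                              // 4 bytes: 11110xxx then three 10xxxxxx
            s[offset++] = (byte) (0xF0 | (c >> 18));
            s[offset++] = (byte) (0x80 | ((c >> 12) & 0x3F));
            s[offset++] = (byte) (0x80 | ((c >> 6) & 0x3F));
            s[offset++] = (byte) (0x80 | (c & 0x3F));
        }
        s[offset] = 0; // keep the buffer null-terminated
        return offset;
    }

    public static void main(String[] args) {
        byte[] buffer = new byte[16];
        int offset = 0;
        for (int c : new int[] {'A', 0xE9, 0x20AC, 0x1F600}) {
            offset = encodeAt(c, buffer, offset);
        }
        System.out.println("bytes used: " + offset);
    }
}
```

Returning the terminator's offset lets a caller chain appends, which mirrors how the offset field is described as being updated after each appended character.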
public void prepareNext()
Set up for nextChar(). Points the offset field at the start of the buffer.
public int nextChar()
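Reading one character at a time with a working offset can be sketched as below. This is an illustration in plain Java, not the Ustr code: Utf8CursorSketch and next are hypothetical names, the sketch assumes well-formed UTF-8, and returning 0 at the terminator is an assumption about end-of-string behavior.

```java
final class Utf8CursorSketch {
    final byte[] s;
    int offset; // working byte offset, advanced by next()

    Utf8CursorSketch(byte[] s) {
        this.s = s;
        this.offset = 0;
    }

    /** Decodes the character at the working offset and advances past it. */
    int next() {
        if (offset >= s.length) {
            return 0;
        }
        int lead = s[offset] & 0xFF;
        if (lead == 0) {
            return 0; // at the terminator: report 0 and do not advance
        }
        int extra; // number of continuation bytes that follow
        int c;     // code point being assembled
        if (lead < 0x80) {
            extra = 0;
            c = lead;
        } else if ((lead & 0xE0) == 0xC0) {
            extra = 1;
            c = lead & 0x1F;
        } else if ((lead & 0xF0) == 0xE0) {
            extra = 2;
            c = lead & 0x0F;
        } else {
            extra = 3;
            c = lead & 0x07;
        }
        offset++;
        for (int i = 0; i < extra; i++) {
            c = (c << 6) | (s[offset++] & 0x3F); // each continuation byte adds 6 bits
        }
        return c;
    }

    public static void main(String[] args) {
        byte[] text = {0x61, (byte) 0xE2, (byte) 0x82, (byte) 0xAC, 0};
        Utf8CursorSketch cursor = new Utf8CursorSketch(text);
        for (int c = cursor.next(); c != 0; c = cursor.next()) {
            System.out.println("U+" + Integer.toHexString(c).toUpperCase());
        }
    }
}
```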
public int strlen()
public static int strlen(byte[] b)
b - the array
public static int strlen(byte[] b, int base)
b - the byte array
base - the byte offset to start counting at
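The contrast between a length in bytes and a length in Unicode characters can be sketched as follows. The code is independent of Ustr; Utf8LengthSketch, byteLength and characterCount are hypothetical names, and the character count assumes well-formed UTF-8, where every character has exactly one non-continuation byte.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class Utf8LengthSketch {
    /** Bytes from 'base' up to, but not including, the 0 terminator. */
    static int byteLength(byte[] b, int base) {
        int n = 0;
        while (base + n < b.length && b[base + n] != 0) {
            n++;
        }
        return n;
    }

    /** Unicode characters from 'base' up to the 0 terminator. */
    static int characterCount(byte[] b, int base) {
        int count = 0;
        for (int i = base; i < b.length && b[i] != 0; i++) {
            // count every byte that is not a 10xxxxxx continuation byte
            if ((b[i] & 0xC0) != 0x80) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        byte[] raw = "na\u00efve".getBytes(StandardCharsets.UTF_8);
        byte[] text = Arrays.copyOf(raw, raw.length + 1);
        System.out.println("bytes: " + byteLength(text, 0));
        System.out.println("characters: " + characterCount(text, 0));
    }
}
```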
public static byte[] strcpy(byte[] to, byte[] from)
to - destination array
from - source array
public static byte[] strcpy(byte[] to, int tbase, byte[] from, int fbase)
to - destination array
tbase - starting offset in destination array
from - source array
fbase - starting offset in source array
public Ustr strcpy(Ustr from)
from - source Ustr
public Ustr strcpy(java.lang.Object o)
o - the source object
public Ustr strcpy(byte[] from)
from - the byte array
public Ustr strcpy(byte[] from, int boffset)
from - the source byte array
boffset - where to start copying in the source array
public static byte[] strcpy(byte[] b, java.lang.String s)
b - the byte array
s - the String
public static byte[] strcpy(byte[] b, int offset, java.lang.String s)
b - the byte array
offset - where in the byte array to load
s - the String
public Ustr sstrcat(Ustr from)
from - the Ustr to be appended
public byte[] sstrcat(byte[] to, byte[] from)
to - dest array
from - source array
public static byte[] sstrcat(byte[] to, int tbase, byte[] from, int fbase)
to - dest array
tbase - base of dest array
from - source array
fbase - base of source array
public static byte[] sstrcpy(byte[] to, int tbase, byte[] from, int fbase)
to - destination array
tbase - starting offset in destination array
from - source array
fbase - starting offset in source array
public static byte[] sstrcpy(byte[] to, byte[] from)
to - destination array
from - source array
public Ustr sstrcpy(Ustr from)
from - source Ustr
public static byte[] strcat(byte[] to, int tbase, byte[] from, int fbase)
to - destination array
tbase - base pos of destination
from - source array
fbase - base pos of source
public static byte[] strcat(byte[] to, byte[] from)
to - destination array
from - source array
public Ustr strcat(Ustr other)
other - the other Ustr
public static int strcmp(byte[] s1, byte[] s2)
s1 - first byte array
s2 - second byte array
public static int strcmp(byte[] s1, int s1base, byte[] s2, int s2base)
s1 - first byte array
s1base - byte offset in first array to start comparing
s2 - second byte array
s2base - byte offset in second array to start comparing
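A byte-wise comparison of null-terminated UTF-8 sequences can be sketched as below. This is an illustration, not the Ustr implementation: Utf8CompareSketch and compare are hypothetical names, and the return convention of -1, 0 or 1 is an assumption. The sketch relies on a general property of UTF-8: comparing bytes as unsigned values yields the same order as comparing code points, which differs from the UTF-16 order used by String.compareTo when non-BMP characters are involved.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class Utf8CompareSketch {
    /** Compares two null-terminated byte sequences, starting at the given bases. */
    static int compare(byte[] s1, int s1base, byte[] s2, int s2base) {
        int i = s1base;
        int j = s2base;
        while (true) {
            // treat running off the end of an array like hitting a terminator
            int a = i < s1.length ? s1[i] & 0xFF : 0;
            int b = j < s2.length ? s2[j] & 0xFF : 0;
            if (a != b) {
                return a < b ? -1 : 1; // unsigned byte comparison
            }
            if (a == 0) {
                return 0; // both sequences ended together
            }
            i++;
            j++;
        }
    }

    /** Hypothetical helper: UTF-8 bytes of str followed by a 0 terminator. */
    static byte[] nullTerminatedUtf8(String str) {
        byte[] raw = str.getBytes(StandardCharsets.UTF_8);
        return Arrays.copyOf(raw, raw.length + 1);
    }

    public static void main(String[] args) {
        String bmp = "\uFF5E";          // U+FF5E, inside the BMP
        String nonBmp = "\uD800\uDC00"; // U+10000, outside the BMP
        System.out.println("UTF-8 byte order: "
                + compare(nullTerminatedUtf8(bmp), 0, nullTerminatedUtf8(nonBmp), 0));
        System.out.println("String.compareTo sign: "
                + Integer.signum(bmp.compareTo(nonBmp)));
    }
}
```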
public int strcmp(Ustr other)
other - the other Ustr
public int strcmp(java.lang.Object other)
other - the other Object
public Ustr strchr(int c)
c - the character, as an integer
public static int strchr(byte[] b, int c)
b - UTF-encoded null-terminated byte array
public Ustr strrchr(int c)
c - the character, as an integer
public static int strrchr(byte[] b, int c)
b - the byte array
c - the integer
public Ustr strstr(Ustr little)
little - the substring to be located
public static int strstr(byte[] big, byte[] little)
big - the array to search in
little - the array to search for
public int charAt(int at) throws java.lang.IndexOutOfBoundsException
at - the index
Throws: java.lang.IndexOutOfBoundsException
public Ustr concat(java.lang.String str)
str - the string
public Ustr concat(Ustr us)
us - the Ustr to append
public boolean endsWith(Ustr suffix)
suffix - the possible suffix
public boolean endsWith(java.lang.String suffix)
suffix - the possible suffix
public boolean equals(java.lang.Object anObject)
Overrides: equals in class java.lang.Object
anObject - the other object
public byte[] getBytes()
public byte[] getBytes(java.lang.String enc) throws java.io.UnsupportedEncodingException
enc - the encoding to use in generating bytes
Throws: java.io.UnsupportedEncodingException
public static void getChars(java.lang.String str, int srcBegin, int srcEnd, char[] dst, int dstBegin)
str - the string
srcBegin - where to start copying
srcEnd - index after last char to copy
dst - start of destination array
dstBegin - where in the destination array to start copying
public void getChars(int srcBegin, int srcEnd, char[] dst, int dstBegin)
srcBegin - where to start copying
srcEnd - index after last char to copy
dst - start of destination array
dstBegin - where in the destination array to start copying
public int hashCode()
Overrides: hashCode in class java.lang.Object
public int indexOf(int ch)
ch - the character
public int indexOf(int ch, int start)
ch - the character
start - where to start looking
public int indexOf(Ustr us)
us - the other Ustr
public int indexOf(Ustr us, int start)
us - the other Ustr
start - the index to start looking
public Ustr intern()
public int lastIndexOf(int ch)
ch - the character
public int lastIndexOf(int ch, int stop)
ch - the character
stop - last index to consider
public int lastIndexOf(Ustr us)
us - the substring to search for
public int lastIndexOf(Ustr us, int stop)
us - the substring to search for
stop - where to stop searching
public Ustr replace(int oldChar, int newChar)
oldChar - the Unicode character to be replaced
newChar - the Unicode character to replace it with
Throws: UstrException
public boolean startsWith(Ustr us)
us - the other Ustr
public boolean startsWith(Ustr us, int start)
us - the other Ustr
start - where to test
public Ustr substring(int start)
start - index of start of substr
public Ustr substring(int start, int end)
start - index of start of substr
end - index of end of substr
public char[] toCharArray()