org.jmol.adapter.readers.cifpdb
class CifReader.RidiculousFileFormatTokenizer extends Object
Regarding the treatment of single quotes vs. primes in CIF files, PMR wrote:
* There is a formal grammar for CIF (see http://www.iucr.org/iucr-top/cif/index.html) which confirms this. The textual explanation is:
14. Matching single or double quote characters (' or ") may be used to bound a string representing a non-simple data value provided the string does not extend over more than one line.
15. Because data values are invariably separated from other
tokens in the file by white space, such a quote-delimited
character string may contain instances of the character used
to delimit the string provided they are not followed by white
space. For example, the data item
_example 'a dog's life'
is legal; the data value is a dog's life.
[PMR - the terminating character(s) are quote+whitespace.
That would mean that:
_example 'Jones' life'
would be an error.]
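The quote+whitespace rule can be sketched in a few lines of Java. This is an illustration only, not the tokenizer's actual code: the class `QuotedTokenSketch` and the method `readQuoted` are hypothetical names, and the sketch assumes that end of line counts the same as trailing white space.

```java
final class QuotedTokenSketch {
    /**
     * Reads a quote-delimited value. 'start' must point at the opening
     * ' or " character. Returns the value without its delimiters, or null
     * if no closing quote followed by white space (or end of line) exists.
     */
    static String readQuoted(String line, int start) {
        char quote = line.charAt(start);
        for (int i = start + 1; i < line.length(); i++) {
            if (line.charAt(i) != quote) {
                continue;
            }
            boolean atEnd = (i + 1 == line.length());
            // A quote only terminates the value when white space follows it.
            if (atEnd || Character.isWhitespace(line.charAt(i + 1))) {
                return line.substring(start + 1, i);
            }
        }
        return null; // unterminated quoted string
    }

    public static void main(String[] args) {
        String legal = "_example 'a dog's life'";
        // prints: a dog's life
        System.out.println(readQuoted(legal, legal.indexOf('\'')));

        String bad = "_example 'Jones' life'";
        // prints: Jones
        System.out.println(readQuoted(bad, bad.indexOf('\'')));
    }
}
```

In the second case the value ends early at `Jones`, leaving the stray text `life'` on the line, which is why PMR reads that input as an error.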
The CIF format was developed in the late 1980s under the aegis of the International Union of Crystallography (I am a consultant to the COMCIFs committee). It was ratified by the Union and there have been several workshops. mmCIF is an extension of CIF which includes a relational structure. The formal publications are:
Hall, S. R. (1991). "The STAR File: A New Format for Electronic Data Transfer and Archiving", J. Chem. Inform. Comp. Sci., 31, 326-333.
Hall, S. R., Allen, F. H. and Brown, I. D. (1991). "The Crystallographic Information File (CIF): A New Standard Archive File for Crystallography", Acta Cryst., A47, 655-685.
Hall, S. R. and Spadaccini, N. (1994). "The STAR File: Detailed Specifications", J. Chem. Inform. Comp. Sci., 34, 505-508.
Field Summary | |
---|---|
int | cch |
int | ich |
int | ichPeeked |
String | str |
String | strPeeked |
boolean | wasUnQuoted |
Method Summary | |
---|---|
String | fullTrim(String str): specially for names that might be multiline |
boolean | getData(): general reader for loop data; fills loopData with fieldCount fields |
String | getNextDataToken(): first checks to see if the next token is an unquoted control code, and if so, returns null |
String | getNextToken() |
String | getTokenPeeked() |
boolean | hasMoreTokens() |
String | nextToken(): assume that hasMoreTokens() has been called and that ich is pointing at a non-white character |
String | peekToken(): just look at the next token |
void | setString(String str): sets a string to be parsed from the beginning |
String | setStringNextLine(): sets the string for parsing to be from the next line when the token buffer is empty, and if ';' is at the beginning of that line, extends the string to include that full multiline string |
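The ';' handling described for setStringNextLine() could look roughly like the following. This is a sketch under assumptions, not the class's implementation: `MultilineFieldSketch` and `readLineOrTextField` are hypothetical names, the sketch returns only the body of the text field without its ';' delimiters, and it ignores anything that follows the closing ';' on the same line.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

final class MultilineFieldSketch {
    /**
     * Returns the next line, or, when that line starts with ';', the whole
     * multiline text field up to the next line that starts with ';'.
     * Returns null at end of input.
     */
    static String readLineOrTextField(BufferedReader reader) throws IOException {
        String line = reader.readLine();
        if (line == null || line.isEmpty() || line.charAt(0) != ';') {
            return line; // ordinary line, or EOF
        }
        StringBuilder sb = new StringBuilder(line.substring(1));
        String next;
        while ((next = reader.readLine()) != null) {
            if (!next.isEmpty() && next.charAt(0) == ';') {
                break; // closing delimiter at the start of a line
            }
            sb.append('\n').append(next);
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        String cif = ";first\nsecond\n;\n_next 1\n";
        BufferedReader reader = new BufferedReader(new StringReader(cif));
        System.out.println(readLineOrTextField(reader)); // the two-line text field
        System.out.println(readLineOrTextField(reader)); // the ordinary line after it
    }
}
```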
Method Detail

fullTrim(String str)
Parameters: str
Returns: str without any leading/trailing white space, and no '\n'

getData()
Returns: false if EOF
Throws: Exception

getNextDataToken()
Returns: next data token or null
Throws: Exception

getNextToken()
Returns: the next token of any kind, or null
Throws: Exception

getTokenPeeked()
Returns: the token last acquired; may be null

hasMoreTokens()
Returns: true if there are more tokens in the line buffer

nextToken()
Returns: null if no more tokens, "\0" if '.' or '?', or next token

peekToken()
Returns: next token or null if EOF
Throws: Exception

setString(String str)
Parameters: str

setStringNextLine()
Returns: the next line or null if EOF
Throws: Exception
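
The return conventions listed above (null when the buffer is exhausted, "\0" for the CIF placeholders '.' and '?') can be illustrated with a much-reduced tokenizer. This is a hedged sketch, not the Jmol source: `LineTokenizerSketch` is a hypothetical class that borrows the documented field and method names, works on a single line only, skips quote handling entirely, and lets nextToken() call hasMoreTokens() itself rather than assuming the caller did. The behavior of getTokenPeeked() here (consuming the peeked token) is likewise an assumption.

```java
final class LineTokenizerSketch {
    private String str;        // line being parsed
    private int ich;           // index of the next character to read
    private int cch;           // length of str
    private String strPeeked;  // token found by the last peekToken()
    private int ichPeeked;     // position just after that token

    void setString(String s) {
        str = s;
        cch = (s == null ? 0 : s.length());
        ich = 0;
    }

    /** Skips white space and reports whether a token remains on the line. */
    boolean hasMoreTokens() {
        if (str == null) {
            return false;
        }
        while (ich < cch && Character.isWhitespace(str.charAt(ich))) {
            ich++;
        }
        return ich < cch;
    }

    /** Returns null if no more tokens, "\0" for '.' or '?', else the token. */
    String nextToken() {
        if (!hasMoreTokens()) {
            return null;
        }
        int start = ich;
        while (ich < cch && !Character.isWhitespace(str.charAt(ich))) {
            ich++;
        }
        String token = str.substring(start, ich);
        return (token.equals(".") || token.equals("?")) ? "\0" : token;
    }

    /** Looks at the next token without consuming it. */
    String peekToken() {
        int saved = ich;
        strPeeked = nextToken();
        ichPeeked = ich;
        ich = saved;
        return strPeeked;
    }

    /** Consumes and returns the token found by the last peekToken(). */
    String getTokenPeeked() {
        ich = ichPeeked;
        return strPeeked;
    }

    public static void main(String[] args) {
        LineTokenizerSketch t = new LineTokenizerSketch();
        t.setString("  C1 . 0.25 ?");
        System.out.println(t.peekToken());               // C1 (not consumed)
        System.out.println(t.nextToken());               // C1
        System.out.println("\0".equals(t.nextToken()));  // true: '.' placeholder
    }
}
```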