Nexus uses a couple of different methods of parsing data. To parse is to attempt to make atomic, meaningful elements from an otherwise continuous stream of near-random values. For example, a web browser parses these sentences to format the words so they will all fit on the page, wrapping if needed.
Some definitions may be useful before this content.
The first method of parsing gathers data into lines. Carriage return and new line characters are used to determine the end of lines.
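The line-gathering method can be sketched in a few lines of Python. This is only an illustration, not the Nexus implementation: the function name split_lines is hypothetical, and treating a carriage return immediately followed by a new line as a single line ending is an assumption the text above does not confirm.

```python
def split_lines(data):
    """Gather a stream of characters into lines (hypothetical helper)."""
    lines = []
    current = []
    i = 0
    while i < len(data):
        ch = data[i]
        if ch == "\r" or ch == "\n":
            # Either character ends the line being accumulated.
            lines.append("".join(current))
            current = []
            # Assumption: CR followed by LF counts as one line ending.
            if ch == "\r" and i + 1 < len(data) and data[i + 1] == "\n":
                i += 1
        else:
            current.append(ch)
        i += 1
    # Keep any trailing text that had no line ending.
    if current:
        lines.append("".join(current))
    return lines
```

Calling split_lines("one\r\ntwo\nthree") yields the three lines "one", "two" and "three".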
The second method, which is much more complex, processes text into words and phrases. A phrase is determined by matching sets of certain characters...
If any of these begins a phrase, and any other set appears between its
opening and closing characters, the inner set is gathered into the phrase... for example ""...
will be a phrase, and can be further parsed into its sub-words...
Spaces and punctuation split lines into word phrases. Carriage return
characters are treated as spaces, and end previously accumulated words just
like spaces and tabs. Most punctuation marks ( % / , ; ! ? = + * & ~ # @ )
end words and become atoms in their own right. A period has the property that
it can be part of an ellipsis (...), where more than one of the same punctuation
character may be in the same atom. A period or colon behaves like a
singular punctuation mark only when the next character is white space
or a '%'.
This rule simplifies the collection of network addresses and filenames. A '%' always
introduces a variable reference. A '-' followed by a number is not
atomic unto itself; instead it is collected with the number, to allow parsing
of negative numbers. If the '-' and the numeric character are separated by a space,
the '-' is atomic unto itself and may have no relation to the number.
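The word-splitting rules above can be sketched as a small Python tokenizer. This is one reading of the rules, not the actual Nexus code. The names tokenize, PUNCTUATION and _ends_atom are hypothetical. Several points are assumptions: the end of the input is treated like white space, a run of colons is handled the same way as a run of periods, a '-' always ends the word before it, and '%' is emitted as its own atom without resolving any variable. Phrase matching is not covered.

```python
# Punctuation marks that end a word and become atoms themselves.
PUNCTUATION = set("%/,;!?=+*&~#@")


def _ends_atom(text, i):
    """True if position i is end of input (assumed), white space, or '%'."""
    return i >= len(text) or text[i].isspace() or text[i] == "%"


def tokenize(text):
    """Split text into word and punctuation atoms (hypothetical helper)."""
    atoms = []
    word = []

    def flush():
        # Emit the word accumulated so far, if any.
        if word:
            atoms.append("".join(word))
            word.clear()

    i = 0
    n = len(text)
    while i < n:
        ch = text[i]
        if ch.isspace():
            # Spaces, tabs and carriage returns all end the current word.
            flush()
            i += 1
        elif ch in PUNCTUATION:
            flush()
            atoms.append(ch)
            i += 1
        elif ch in ".:":
            # Collect a run of the same character, such as an ellipsis.
            j = i
            while j < n and text[j] == ch:
                j += 1
            if _ends_atom(text, j):
                # Followed by white space or '%': acts as punctuation.
                flush()
                atoms.append(text[i:j])
            else:
                # Otherwise it stays inside the word (addresses, filenames).
                word.append(text[i:j])
            i = j
        elif ch == "-":
            flush()  # assumption: '-' always ends the preceding word
            if i + 1 < n and text[i + 1].isdigit():
                word.append(ch)  # collected with the number that follows
            else:
                atoms.append(ch)  # atomic unto itself
            i += 1
        else:
            word.append(ch)
            i += 1
    flush()
    return atoms
```

Under these assumptions, tokenize("ping 192.168.0.1:80 now...") keeps the address together as one atom and returns the ellipsis as a separate atom, while tokenize("x=-5") gives "x", "=" and "-5".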
Revision may be required to this page...