Compression of a representation of social discourse
Overlay of human inductive inference
Communicated to Industry: 12/27/2003 9:41 AM
The Ontology Referential Base (Orb) is a type of microscope for precise measurement of the co-occurrence of terms in any document collection. As such, Orb technology is new and unexpected.
A foundation in set theory and category theory makes the algorithmic development of an Orb very easy. Unlike precursor technologies such as latent semantic indexing, Orbs are fast to develop. The results are precise and understandable by humans.
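As a rough illustration of what measuring co-occurrence can look like, the sketch below records every unordered pair of words that appears within a fixed number of consecutive tokens. The function name build_orb, the whitespace tokenizer, and the sliding-window definition of scope are assumptions made for this example; they are not taken from the Orb notation itself.

```python
from itertools import combinations

def build_orb(documents, scope=3):
    """Collect unordered word pairs that co-occur within `scope` consecutive tokens.

    Hypothetical helper: the windowing rule and tokenizer are assumptions.
    """
    orb = set()
    for text in documents:
        tokens = text.lower().split()
        for start in range(len(tokens)):
            window = set(tokens[start:start + scope])
            # Sorting gives each pair one canonical order, so (a, b) == (b, a).
            orb.update(combinations(sorted(window), 2))
    return orb

docs = ["the cat sat on the mat", "the dog sat on the rug"]
orb = build_orb(docs, scope=2)
```

The result is a plain set of pairs, which is why the structure can be inspected directly by a person.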
The index and Orb construction can be as little as 8% of the memory footprint of the corresponding ASCII text.
An Orb can be disassociated from the word index and shifted from one target to another. The Orb footprint can be as small as 0.5% of the corresponding ASCII text. Orb production is a true compression of information. Following a well-known property of word usage in social discourse, the rate of increase in co-occurrence patterns declines to essentially zero unless the topic of discussion shifts in such a way as to create new patterns.
Compression of a representation of social discourse as a function of time is possible, with 100% syntactic retrieval over any co-occurrence pattern. This compression enables a poll-like analysis of social discourse.
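The saturation behavior described here can be sketched by counting how many previously unseen pairs each new batch of text contributes. The helper names, the toy batches, and the sliding-window definition of co-occurrence are all assumptions made for illustration.

```python
from itertools import combinations

def pairs_in(text, scope=3):
    """Unordered word pairs within `scope` consecutive tokens (assumed definition)."""
    tokens = text.lower().split()
    found = set()
    for start in range(len(tokens)):
        window = set(tokens[start:start + scope])
        found.update(combinations(sorted(window), 2))
    return found

def new_patterns_per_batch(batches, scope=3):
    """For each batch of texts, count the pairs not seen in any earlier batch."""
    seen = set()
    counts = []
    for batch in batches:
        batch_pairs = set()
        for text in batch:
            batch_pairs |= pairs_in(text, scope)
        fresh = batch_pairs - seen
        counts.append(len(fresh))
        seen |= fresh
    return counts

batches = [
    ["markets fell sharply", "markets fell again"],  # first look at a topic
    ["markets fell sharply"],                        # same topic, nothing new
    ["storm hits coast"],                            # topic shift
]
growth = new_patterns_per_batch(batches)
```

In this toy run the count drops to zero while the topic stays the same and rises again only when the topic shifts.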
http://www.bcngroup.org/area2/KnowledgeEcologies.htm
The poll mechanism uses bit maps that are very small, perhaps 2-3 megabytes for all English posted on the Internet in one day. One property of the Orb is that specific information, for example who is saying what, is lost in the process of producing the categoricalAbstraction that is foundational to Orb production.
An inverted index on all of the data is necessary to gain specific knowledge about specific occurrences of a linguistic pattern. When privacy is a legal issue, the actual step required to look at specific private information is separated from the global analysis. The Orbs are separated from private information as a by-product of how the Orbs are generated.
Using an Orb is, then, like using anything else: the negative or positive consequences are part of the use and not of the existence of the thing. Our social need is to know what is being discussed within groups who pose a very real and immediate threat to world economic and social order.
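One way to picture such a bit map is to give each known co-occurrence pattern a fixed bit position and set that bit when the pattern appears in a given day's text. The one-bit-per-pattern layout, the name poll_bitmap, and the example patterns are assumptions for this sketch; the actual encoding is not described here.

```python
def poll_bitmap(day_patterns, pattern_ids):
    """Pack presence or absence of each known pattern into one integer (assumed layout)."""
    bits = 0
    for pattern in day_patterns:
        bits |= 1 << pattern_ids[pattern]
    return bits

# Hypothetical pattern table: each pattern gets a fixed bit position.
pattern_ids = {("fell", "markets"): 0, ("coast", "storm"): 1, ("rates", "rise"): 2}

monday = poll_bitmap({("fell", "markets"), ("rates", "rise")}, pattern_ids)
tuesday = poll_bitmap({("coast", "storm"), ("rates", "rise")}, pattern_ids)

shared = monday & tuesday     # patterns present on both days
appeared = tuesday & ~monday  # patterns that are new on Tuesday
```

The bit map records only whether a pattern occurred, so nothing about who wrote what survives in it.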
Orbs have projection properties explained in the previous post. Most domain specific Orbs will occupy less than 100K of memory and be read from a plain ASCII text file.
The technology will support consensual mapping of knowledge flow within communities and provide security and privacy from the ground up, where desired.
Orb information is NOT semantic. It is purely syntactic. The Orb construction is an exact model of the co-occurrence of word patterns. The visualization of Orb construction is simple and completely understood by children.
Meaning and context can be overlaid with the newly emerging Human Mark-up Language standard.
A powerful type of natural inference can be overlaid using Mill’s logic (plausible reasoning), following work on quasi-axiomatic logic developed by the applied semiotic school. This inference is, by its very nature, NOT a deductive inference in the sense of the tradition of academic logic. The Orb theory of inference is grounded in modern cognitive neuroscience in a fashion that accounts for the results surrounding Gödel’s theorem in the foundations of logic and Bell’s inequality in the foundations of physics.
By relying on human visual and cognitive influence, Orb inference takes on the nature of human abduction and induction.
Perhaps most interesting is the formal similarity between Orb encoding, transforms and use and the holographic model of human brain function developed by Karl Pribram.
Orb technology can be applied to scientific measurement of bio-reactivity and other physical processes. When combined with stratified number theory, SLIP, categoricalAbstraction, and eventChemistry, the Orb notation is one of the foundational elements for the knowledge sciences.
Orbs with a few heuristics will easily stand up a subject matter taxonomy through the analysis of existing community document repositories. These subject matter taxonomies are now either non-existent (as in the FCC internal document management system), or have been very expensive to generate. Often these taxonomies are subject to control by proprietary interests.
Once one has an Orb and any full text (inverted) index, then one can deliver a 100% syntagmatic precision/recall of the documents with the specific co-occurrence of two, three or more words within a variable scope.
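A minimal sketch of this retrieval step, assuming a positional inverted index and a scope measured in consecutive tokens, might look as follows. The names build_index and cooccurring_docs, the whitespace tokenizer, and the sample documents are hypothetical and stand in for whatever full text index is actually used.

```python
from collections import defaultdict

def build_index(documents):
    """Positional inverted index: word -> {doc_id: [token positions]}."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in enumerate(documents):
        for pos, word in enumerate(text.lower().split()):
            index[word][doc_id].append(pos)
    return index

def cooccurring_docs(index, terms, scope):
    """Doc ids in which all `terms` fall inside some window of `scope` tokens."""
    postings = [index.get(term, {}) for term in terms]
    candidates = set(postings[0])
    for posting in postings[1:]:
        candidates &= set(posting)
    hits = []
    for doc_id in sorted(candidates):
        position_lists = [posting[doc_id] for posting in postings]
        # A qualifying window can always be slid to start at one of the terms.
        starts = {pos for plist in position_lists for pos in plist}
        for s in starts:
            if all(any(s <= pos < s + scope for pos in plist)
                   for plist in position_lists):
                hits.append(doc_id)
                break
    return hits

docs = [
    "orb technology measures term co-occurrence",
    "term lists and orb files sit far from any mention of co-occurrence",
    "nothing relevant here",
]
index = build_index(docs)
two_words = cooccurring_docs(index, ["orb", "term"], scope=4)
three_words = cooccurring_docs(index, ["orb", "term", "co-occurrence"], scope=5)
```

Because the check is an exact test on token positions, every document it returns contains the pattern and no matching document is missed, which is the sense in which syntactic retrieval can be complete.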
http://www.bcngroup.org/python3/thirtynine.htm
A provisional patent covers the use of Orbs in this way.
The issues of variable scope, use of controlled vocabularies, and computationally easy projections from the largest Orb, for a specific document collection, to “views” of the largest Orb are concisely covered at the short link above.
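If an Orb is held as a set of word pairs, one plausible reading of a projection to a “view” is a filter that keeps only the pairs whose words both belong to a controlled vocabulary. This reading, the function name project, and the sample data are assumptions made for illustration; the linked page remains the reference for the actual construction.

```python
def project(orb, vocabulary):
    """Keep only the pairs whose two words are both in the controlled vocabulary."""
    vocab = set(vocabulary)
    return {(a, b) for (a, b) in orb if a in vocab and b in vocab}

# Hypothetical "largest Orb" for a small collection.
largest_orb = {("cat", "sat"), ("cat", "the"), ("dog", "sat"), ("on", "sat")}
view = project(largest_orb, ["cat", "dog", "sat"])
```

Under this assumption a projection is a single pass over the pairs, which would fit the description of projections as computationally easy.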
The description of the process of producing Orbs once one has a controlled vocabulary is in the notational paper:
http://www.bcngroup.org/area2/KSF/Notation/notation.htm
InOrb Technologies needs two months to complete an Orb SDK and several examples of completed Orb products.
One of these will be an automated taxonomy generation product.
The investment we need at this point is $30,000, with a plan to return $30,000 - $60,000 to the investor within 90 days by selling the company InOrb.com with the Orb SDK as the product.
The product will license the Orb technology from OntologyStream Inc. One of the customers for the InOrb SDK is dataRenewal Inc.
Both the dataRenewal and the InOrb investment templates were produced by the not-for-profit corporation and scientific foundation, the BCNGroup.