Graph representation of knowledge

We hope that our proposal to NASA is funded, since this would allow the start-up company Ontologystream to complete its first phase. Otherwise Ontologystream Inc may be dissolved this year, after three years of work on getting it started. We have tried hard over the past three years, but the business environment has been difficult. The proposal involves a small team in which each member is well known to the others. If I were to accept an academic appointment this year, the funding would be transferred to my new home university.

If funded, we expect to be able to demonstrate the use of Referential Information Bases (RIBs) and Orbs (Ontology referential bases) as a means to encode the informational states related to knowledge production and propagation. In this case, the knowledge will be about Earth Observational Data and the underlying physical processes that are known or conjectured to be causal in the phenomenon being observed.

A class of “data mining” transformations will be defined as convolutions. The members of this class are then articulated within the Goldfarb notational system. Earth science discovery will be accompanied by explicit procedures that allow the discovery to be repeated.

On Patents

There will be two outstanding patent issues, one related to the 1993 Applied Technical Systems’ patent on what they call the Contiguous Connection Model (CCM) and the other related to the 2003 Primentia patent on representing an ASCII string as a number in base 64, and using this representation to produce a “key-less” hash table. However, there are a number of additional information technology innovations, including Prueitt’s voting procedure for categorization using subfeatures, the Orb generalizations of CCM, and the (unpublished) generalization of the Primentia patent using non-number ordered sets of symbols. [1]
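To make the base 64 idea concrete, the following is a minimal sketch and not a description of the patented method. It assumes a hypothetical 64-symbol alphabet (`ALPHABET`), reads a string as a base 64 numeral, and keeps the resulting integers in ascending order so that membership is tested by binary search over that order. The names `string_to_number` and `OrderedTable` are illustrative only.

```python
from bisect import bisect_left

# Hypothetical 64-symbol alphabet; the position of a symbol is its digit value.
ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz_-"


def string_to_number(s):
    """Read s as a numeral in base 64 over ALPHABET."""
    value = 0
    for ch in s:
        # str.index raises ValueError if ch is not one of the 64 symbols.
        value = value * 64 + ALPHABET.index(ch)
    return value


class OrderedTable:
    """Stores strings as integers kept in ascending integer order."""

    def __init__(self):
        self._codes = []

    def add(self, s):
        code = string_to_number(s)
        i = bisect_left(self._codes, code)
        if i == len(self._codes) or self._codes[i] != code:
            self._codes.insert(i, code)

    def __contains__(self, s):
        code = string_to_number(s)
        i = bisect_left(self._codes, code)
        return i < len(self._codes) and self._codes[i] == code


table = OrderedTable()
for word in ["Orb", "RIB", "CCM"]:
    table.add(word)
print("RIB" in table, "ETS" in table)
```

One limitation of this sketch is that strings differing only by leading `0` symbols map to the same integer, so a practical scheme would need a fixed width or an alphabet without a zero digit.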

This unpublished work is related to the published work on generalFramework (gF) theory. If one understands the generality of frameworks (see the paper), one can also see that primitives, labels and types (to use Goldfarb’s language) can be descriptively enumerated and then form a RIB that uses no numerical model. Therefore there would be no infringement on any existing patent.

Several of the keys to understanding how Goldfarb’s work helps in the development of a knowledge operating system lie in what Prueitt regards as generalizations of these two patents. He is not making a case for bypassing either or both of these patents, since in both cases one would expect to be able to find value in using the patent as a foundation for new market development. However, in both cases, the core issues of knowledge representation are only partially addressed.

A complete theory of knowledge representation would appear to be on the horizon, and can be achieved by carefully integrating the elements discussed in the NASA proposal. The foundational notation would seem to be an integration and extension of both the Orb notation and the Goldfarb notation, along lines suggested by Goldfarb. Specifically, Goldfarb advocates the development of a notational formalism whose roots are not related to the numerical model. (page 2)

In the next section, an application of stratified theory is suggested in several areas, including a new type of bioinformatics that follows Goldfarb’s suggestion that a history of structural transformations is an essential part of the development of computer models of complex metabolic processes.

There are some surprises that come from the Orb/ETS notation. All, or almost all, known data encoding and data mining algorithms have a numerical model. The Orb/ETS structural formalisms do not necessarily have a numerical model.

In fact the encoding of structural information may involve the notion of category, and of relationships between categories, without any appeal to any property of numbers.

Orbs combine with categoricalAbstraction and eventChemistry to create new types of “data mining”. Qualitative Structure Activity Analysis (Q-SAR) may then be used to examine the “double articulation” of structure as (1) simply structure, i.e., what is in a compound, and (2) function, i.e., how the compound as a whole acts within an environment.

The path here is a long one, and involves the development of basic notational tools that reflect deep and sometimes esoteric principles, without there being what some would consider an immediate business model. However, the business model is there. The initial “business” problem is that a rather fundamental shift, in fact a shift that involves several deep and radical changes in the cultural and scientific paradigm, is required to even start talking about how to use the notational tools.

We, a small community of basic researchers, feel that the numeric model underlying the current generation of algorithms profoundly limits the concepts expressed by the Artificial Intelligence and Semantic Web communities. We are close to having mature technology that can compete with existing technology. We are close, particularly in the commercial application of structural formalisms, such as the Orbs and Goldfarb’s ETS, in the development of anticipatory web services.

The foundational breakthrough, in mathematics and logic, has already occurred. Many simple demonstrations of the consequences of the breakthrough have been developed and shown. However, the current community of program managers, MBAs and venture capitalists has a type of memetic immunology that sees this work as a threat to the status quo that they can ignore.

The relationship between bioinformatics, medical science, and the measurement of invariance

Lev Goldfarb makes the point in his papers and in discussions that the Evolving Transformation System (ETS) notation is conceived as both “stratified” and “level” independent. At first, I questioned his statement that the new formalism has to be level independent. I was thinking about the issues in a different way, and it took a while for me to see the principled position that Goldfarb had taken. I now completely agree that stratified notation must be level independent. Level dependent notation, and notation relating one specific level to another specific level, has yet to be developed. My intuition is that in each case, these notational constructions will be directly derived from a measurement of structural invariance.

There are some issues that can and should be talked about in university courses that I hope to develop, as part of the National Project.

I have agreed with Goldfarb that a stratified formalism should be level independent. There should be a pure abstraction over exactly what is level independent. A moment’s reflection will reveal that the classical notions of location and distance are not “level independent”. There is still an arrow to the expression of time, but the temporal scales are not level independent. The comprehension of the concepts being discussed does not come naturally to someone educated within our educational system. The curriculum for the National Project is there in the scholarly literature, but is not represented in the K-12 curriculum, and is not addressed in most college courses.

But the concepts are also simple and straightforward, if only we were not “educated” to question the principle that stratification must depend. This principle states the apparent enigma, “there are things that do not exist”, because what one means by “exist” has to be conditioned by the level at which one’s awareness is directed. Quantum mechanics should set aside the Newtonian notion of existence as something that conserves action/reaction, but our popular notions of causation are not set aside. The conservation laws hold within a level of organization, so one expects that the laws of nature are specific and knowable. This is what we find, in regard to conservation of energy/mass for example. But what about intentions? Are good intentions conserved? This question is not well formed.

In some specific way, the “pragmatics” of a specific level will have different content. So a formalism that is “not stratified” will be captured as specific notation and will have phenomenon-specific theory related to specific types of phenomena. Newtonian physics is the best example of a non-stratified theory. It is a widely discussed point that Newtonian physics does not bring a fundamental understanding of the nature of life. This discussion is carried on within various camps and the viewpoints vary. The knowledge science curriculum will help focus this debate.

To a great extent, classical theories of bioinformatics are written “as if there is only one scale”, and as a consequence the problems related to the indeterminism of some state transitions have been ill posed within bioinformatics. The same problem exists in linguistics, and yet here the problem is most properly recognized as “double articulation”. But linguistics is not the only field to recognize a “double articulation” phenomenon. In some important research literatures the independence of function from structure is explicitly recognized. However, a full recognition of stratified theory by the science community has been avoided. We do not know how to proceed if the concept of a single level of phenomena is abandoned. There must be something to move into.

There are things like human consciousness, which may exist at the same time on several levels of physical-energy organization (private conversations with Karl Pribram). Penrose and Hameroff have worked on these issues in the development of a theory of emergence called “self orchestrated collapse”. The theory directly challenges the modern foundation of science, i.e., the numeric model, in many ways. New curriculum is needed to set the stage so that these challenges can be addressed by individuals who have the required background. The issues cannot be resolved up front, before we see the evolution of a science of knowledge systems. But they can be “bracketed” as something that scholars will need to resolve at some point in history.

At this point, all we are asking for is an agreement that these are open problems. This is contrary to what many in the AI and Semantic Web communities would have us believe, namely that most knowledge representation problems are almost completely solved and that they are the ones who know how to solve them.

Ok, so what is stratified theory?

The first implication that comes from a search for a level independent notational theory is that the formalism should allow the creation of pattern recognition transforms and would allow one to compose these transforms into a string of graphs, as discussed by Goldfarb.

The best full presentation is http://www.cs.unb.ca/~goldfarb/ets2/ .

Let us consider again the diagram:

Figure 2: Goldfarb’s Evolving Transformation System (2004)

The string of graphs in the left part of Figure 2 has temporal information, i.e., state p₃ occurs before p₄, and structural information. In this case each transformation has two affordances leading “into” the event and two affordances leading “out of” the event. (The notion of affordance is developed by the ecological physics community and is based on the 1950s work of J. J. Gibson.)

In Goldfarb’s notation one has the ability to define a theory of category type and assign this type to lines of affordance. Because this notation is level independent, it has been possible for Goldfarb to suggest a stratified notational system that does the accounting for how these lines of affordance work in the cross-scale mechanics related to emergence and dissipation.
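The idea of typed lines of affordance leading into and out of an event can be illustrated with a small data structure. This is only a sketch under our own assumptions and is not Goldfarb’s formalism: the names `Event`, `can_follow` and `is_well_formed` are hypothetical, category types are plain string labels, and an event is allowed to follow another when every type it requires is offered by its predecessor.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """A transformation with typed affordances (types are labels, not numbers)."""
    name: str
    inputs: tuple   # category types of the affordances leading into the event
    outputs: tuple  # category types of the affordances leading out of the event


def can_follow(earlier, later):
    """Assumed rule: each input type of `later` is offered by `earlier`."""
    return all(t in earlier.outputs for t in later.inputs)


def is_well_formed(history):
    """Check every consecutive pair in a temporally ordered string of events."""
    return all(can_follow(a, b) for a, b in zip(history, history[1:]))


# Two events, each with two affordances in and two affordances out.
p3 = Event("p3", inputs=("x", "y"), outputs=("u", "v"))
p4 = Event("p4", inputs=("u", "v"), outputs=("x", "y"))

print(is_well_formed([p3, p4]))  # p3 occurs before p4 and offers what p4 needs
print(is_well_formed([p4, p4]))  # p4 does not offer the types it requires
```

The list order carries the temporal information and the type labels carry the structural information; no property of numbers is used to decide whether the string is well formed.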

A structural model is needed to capture information about the lines of affordance. Anticipatory responses are “ontologically” possible due to a regularity of system evolution. This regularity is expressed structurally and functionally; structurally at one level of organization and functionally at another.

The natural universe is not deterministic, but when systems are in specific states they have tendencies to evolve to specific other states. The Goldfarb Evolving Transformation System notation is designed to do “structural accounting”. Points of ambiguity, having a structural constraint of indeterminacy, are simply underdetermined. The notational accounting allows different structures to fulfill the same functional need, and allows for functional needs to shift when the required structural fulfillments are not present. The transformation outcome can be realized using more than one set of fulfillments. The fulfillment has to select only one of these sets, or create some deviation of function or structure so that some set of forces achieves a balance, even if only a temporary balance. In each case, what actually happens then becomes part of the information that propagates and constrains future or neighboring transformations.
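The accounting described above, where several sets of structures can fulfill one functional need and the outcome is recorded for later use, can be sketched as follows. This is an illustration under our own assumptions, not part of the ETS notation: structures are hypothetical labels such as `"s1"`, and the first alternative whose structures are all present is the one selected.

```python
def select_fulfillment(alternatives, available):
    """Return the first alternative whose structures are all available, else None."""
    for option in alternatives:
        if option <= available:  # subset test: every required structure is present
            return option
    return None


def transform(function, alternatives, available, history):
    """Select one fulfillment and record what actually happened."""
    chosen = select_fulfillment(alternatives, available)
    # A result of None marks the case where the functional need must shift.
    history.append((function, chosen))
    return chosen


history = []
alternatives = [frozenset({"s1", "s2"}), frozenset({"s3"})]

first = transform("f", alternatives, {"s3", "s4"}, history)
second = transform("f", alternatives, {"s4"}, history)

print(first)    # the single-structure alternative is selected
print(second)   # no alternative can be fulfilled
print(history)  # the record that constrains later transformations
```

Only one set of fulfillments is selected per transformation, and the recorded history is the information that propagates to future or neighboring transformations.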

Bioinformatics, like other natural sciences, must be able to delineate, as a matter of empirical science, the set of structural fulfillments and the function/structure aspect of the transformations that make these fulfillments.

The Goldfarb notation accounts for the set of affordances going into a stable state and the set of affordances leading away from a stable state (see Figure 3). Clearly things can become complicated. A representation of a state, such as a mental event (or the metastable state of a rhodopsin protein in the retina), is incomplete without structural information and information about the function of structure within an environment. The logic that tracks structure/function relationships is not the classical logic of Aristotle, and perhaps has not been properly defined.

However, our study of the Russian work on Mill’s and Peirce’s logic does allow a predicative inference about the function of compounds assembled from a set of primitives. Goldfarb’s work appears to us to be consistent with this work, called “plausible reasoning” and “applied semiotics” by Dmitri Pospelov and Victor Finn.

The representations of both the structure and the function of compounds are complex in a technical sense. The representation itself might be considered an information state that indicates two quite different aspects of one thing: the thing itself and the thing in the context of its role within a larger system of things. One need not be philosophical here. A state is a structural pattern. We define a state to be something that “reoccurs”. States reoccur, and the precision of the reoccurrence is “measured” in a “comparison” to a prototypical instance of the state. Information is the structural relationships in specific structures and how these “sit” within a larger universe. The measurement is made to reconcile these two aspects, not philosophically but as part of a real physical process.
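The comparison of an observed state to a prototypical instance can be sketched structurally. In this illustration, which rests on our own assumptions, a state is a set of relationships between labeled parts, and the comparison reports which relationships are shared, missing, or extra rather than producing a numeric score. The names `Comparison`, `compare_to_prototype` and `reoccurs` are hypothetical.

```python
from typing import NamedTuple


class Comparison(NamedTuple):
    shared: frozenset   # relationships present in both instance and prototype
    missing: frozenset  # relationships in the prototype but not in the instance
    extra: frozenset    # relationships in the instance but not in the prototype


def compare_to_prototype(instance, prototype):
    """Structural comparison of two states given as sets of (part, part) pairs."""
    instance, prototype = frozenset(instance), frozenset(prototype)
    return Comparison(instance & prototype,
                      prototype - instance,
                      instance - prototype)


def reoccurs(instance, prototype, tolerated=frozenset()):
    """Assumed criterion: every deviation belongs to an explicitly tolerated set."""
    c = compare_to_prototype(instance, prototype)
    return (c.missing | c.extra) <= frozenset(tolerated)


prototype = {("a", "b"), ("b", "c")}
instance = {("a", "b"), ("b", "c"), ("c", "d")}

result = compare_to_prototype(instance, prototype)
print(result.extra)
print(reoccurs(instance, prototype))
print(reoccurs(instance, prototype, tolerated={("c", "d")}))
```

The set of tolerated deviations plays the role of the “precision” of the reoccurrence; what belongs in that set is an empirical question about the structures involved.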

It is a matter of empirical observation about the nature of the structures involved.

We are hoping that abstraction will allow a discussion of principles that correspond to physical realities, including the physical realities related to the emergence, period of stability, and collapse of any class of complex events.

Figure 3: Diagram from Chapter 4 (1996), Foundation of Knowledge Science

A comparison to the next figure can help one understand the nature of stratification. The image in Figure 3 is the holonomic aspect (Pribram’s use of the term) of an anticipatory response. The components are the categoricalAbstraction atoms.

Figure 4: Prueitt’s Anticipatory Web of information (2004)

The drawing in Figure 3 suggests that most things are part of a series of expressions where some cohesive envelope, as in Maturana’s term “autopoiesis”, maintains the image of the self/system.

Each element in this series is the product of an evolving transformational system, and can be formalized in the Goldfarb notation. The ETS notation encodes, in real time, the past history of the object that one now has in front of oneself. The encoding of these event states involves a measurement of the set of affordances that are involved in transforming physical reality into localized structure, as this localized structure performs the functional requirements imposed by the larger system in which it finds its existence.

The work by Lev Goldfarb on Evolving Transformation Systems

Lev Goldfarb’s home page.

Basic paper.

[1] The Primentia patent depends on regarding the ASCII string as a base 64 number, and ordering the hash containers according to the integer order. The Orb generalization uses the fact that an integer has properties that one does not need. One can put an order on a set without creating numbers. So a set of primitive symbols can be assigned an order, and then strings of these symbols can be assigned the inherited order in a standard way. The result is the same as that achieved using the Primentia patent, except that one never uses a “number”, only an ordering of a set of primitive symbols. The layering of semantics on the chosen order can add value to the data encoding, as is done in the Orbs, and the development of partial orders also allows unexpected surprises.
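The footnote’s point, that strings can be ordered from an ordering of primitive symbols without ever forming a number, can be sketched as follows. This is an illustration under our own assumptions and not the Orb encoding itself: the order is given simply as a sequence of symbols, `precedes` decides which of two symbols is met first when walking that sequence, and strings inherit the order in the usual dictionary fashion, with a proper prefix placed before its extensions. All function names are hypothetical.

```python
from functools import cmp_to_key


def precedes(a, b, order):
    """True if symbol a is met strictly before symbol b when walking `order`."""
    for sym in order:
        if sym == a:
            return a != b
        if sym == b:
            return False
    raise ValueError("neither symbol belongs to the ordered set")


def string_precedes(s, t, order):
    """Inherited (dictionary) order on strings of symbols."""
    for a, b in zip(s, t):
        if a != b:
            return precedes(a, b, order)
    # No differing position: a proper prefix comes before its extensions.
    return len(s) < len(t)


def compare(s, t, order):
    """Three-way comparison in the form expected by cmp_to_key."""
    if s == t:
        return 0
    return -1 if string_precedes(s, t, order) else 1


def sort_strings(strings, order):
    """Sort strings by the inherited order; no string is converted to a number."""
    return sorted(strings, key=cmp_to_key(lambda s, t: compare(s, t, order)))


# An arbitrary chosen order on three primitive symbols: c before a before b.
order = ("c", "a", "b")
ordered = sort_strings(["ab", "ca", "a", "b"], order)
print(ordered)
```

Because the containers are arranged by this inherited order, they can be searched in the same way as integer-ordered containers, while the chosen order on the primitive symbols remains free to carry semantics.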