
 

Chapter 5

 

A General Framework for Computational Intelligence


 

Section 1: Introducing a Process Model for Human-mediated Information Production

 

The previous chapters established preliminary notions that serve as a background, drawn from the natural sciences, for the computer sciences.  In this chapter we take up the issue of representing knowledge as a data structure and then using this data structure as a machine ontology for various purposes.

 

First, let us look at measurement, instrumentation, representation and encoding as a single step.  This step is often not examined in detail by all of the stakeholders involved in the use of the information, and so the systems developed in this way have limitations.  Why measurement, instrumentation, representation and encoding are not done well involves two aspects.  The first is cultural, and often this means either institutional resistance or the design practices of companies who provide information tools.  The second has to do with the limited level of comprehension that we have about the human knowledge experience.

 

The current level of development of computer science is also a factor.  Many have discussed the issue of whether computer science has been developed on a foundation that is not consistent with natural science, in particular the life sciences.  This issue of foundations is addressed in the BCNGroup roadmap for semantic technology adoption [1].

 

How the data is encoded into active memory or as a file (perhaps using just ASCII letters) is important.  If one has a standard relational database, the instrumentation and measurement are really combined in the form of source or input data.  The representation occurs sometimes in the mind of a human who is formulating input, and sometimes by a machine process that is acting according to some set of rules.

 

The open source movement has been attempting to improve the quality of information systems.  But there are issues that control the quality of measurement, instrumentation, representation and encoding, and these issues lie beyond the scope of current computer science.  A resolution of these issues is where we feel the foundations of knowledge science are to be found.

 

If the data is coded poorly, then a set of limitations arises from that poor encoding.  We will look at the theme vector representation of meaning in a moment.  The problem, as we will see, is not simply that the relational model is too strong to be agile; it is that the numerical model is used in a way in which it should not be used.

 

XML is of great value because it represents the localization of information as something that can be given a name and properties.  Resource Description Framework (RDF) and Topic Maps make a further refinement in what can be done using an underlying graph theoretical model.  The nodes of the graph are the localization of knowledge about something, the labels can be properties and other metadata, and the links can be relationship variables that can occur (or are occurring) between localizations.  We will here refer to a node as a location in order to reinforce the generality of the graph model.

 

XML is a standard for placing structured information into a text file.  Text files of this sort can then be parsed so as to produce something based on the information that is made accessible by the XML.  But what is essential about how the information can be placed into XML?  The answer has a lot to do with having an agreed upon standard for expressing information in a highly structured way.  A tag for the information acts as a type of object, and in fact the attributes of the tag can be used as if these attributes are internal data and even internal procedures (or methods).  So the mental visualization of a subject (to use the Topic Maps term for a “node”) is given a form that allows a specific evocation of human mental experiences, much like writing the words of a natural language onto a piece of paper.  Words are written into a structure that is loose and has a great deal of variation.  Community agreement about the use of words and grammar is why the representation of information in language is useful.
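
As a minimal sketch (the element and attribute names here are hypothetical, not part of the XML or Topic Maps standards), a localization can be tagged and then read back programmatically, with the tag acting as a type of object and its attributes acting as internal data:

import xml.etree.ElementTree as ET

# Hypothetical fragment: one "location" (a Topic Maps subject) whose
# attributes act as internal data, with child elements as further metadata.
fragment = """
<location id="loc-042" label="Trace Route event" source="RealSecure">
    <property name="category">D1</property>
    <comment>Analyst judgment attached to this localization.</comment>
</location>
"""

node = ET.fromstring(fragment)

# The tag behaves like a typed object; its attributes behave like fields.
print(node.tag, node.attrib["id"], node.attrib["label"])

# Child elements carry further properties and free-text commentary.
for child in node:
    print(child.tag, child.attrib, (child.text or "").strip())

The point is only that community agreement on the tag vocabulary, like community agreement on words and grammar, is what makes the structured file useful.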

 

The meaning of the words comes from the interpretation of information by a mind.  This being said, however, one recognizes the great value that computer based knowledge representation has if the computer can be used to move the information around and produce secondary results, such as the algorithmic clustering together of words that co-occur in a text [2] and the retrieval of information that is stored remotely.

 

Text mining and knowledge representation have been trying to address this task, so that knowledge referenced in natural language can be referenced in ways that were not anticipated during the production of the natural language.  This has been, so far, a hard problem, largely because of the cognitive and social aspects of knowledge representation.

 

The knowledge representation problem has been addressed using graph theoretical constructs and some other methods, which we will discuss in this chapter.  The notion of differential and formative ontological models is developed as a means to tie together the graph theoretical model (of implicit ontology) with a continuum mathematical model of the nearness and other types of “relationships” that occur in language use, and in the modeling of structure to function relationships.  The concepts of categoricalAbstraction (cA) and eventChemistry (EC) are also introduced in this chapter to help formulate a general framework for computational intelligence that is grounded in the remarks of the previous four chapters.

 

Detecting facts and events, producing and using models, and discovering relationships lead to the development of ontological work products.  The work products need an encoding that allows reuse in the various contexts noted: generating options in a decision support environment, developing reporting mechanisms and auxiliary work products, and triggering consequences.

 

Through measurement, instrumentation, representation and encoding we establish fidelity between contexts and knowledge representation.

 


Section 2: Process Model, measurement, instrumentation, representation and encoding

 

Let us start this discussion by looking closely at part of the process model in Figure 1.  We should compare closely the type of theory that has been developed from the previous chapters and think carefully about what instrumentation and measurement mean in this context.  The issues of representation and encoding are so tightly linked to understanding and performance that we have to look at measurement in this light.

 

 

Figure 1: A process model

 

Our encoding strategy is to develop syntagmatic representations in the form of

 

O = { < c1, r, c2 > }

 

that can exist in either mono-level or tri-level knowledge structures.  The difference is that in tri-level structures the set O is regenerated in real time and made subject to "pragmatics" at the point of decision, whereas in mono-level structures the set O is a predefined resource supporting decision making.
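
As a minimal sketch (all names here are illustrative only), a mono-level structure can hold O as a predefined set of triples, while a tri-level structure would regenerate O from current observations at the point of decision:

from typing import Hashable, Set, Tuple

Triple = Tuple[Hashable, Hashable, Hashable]   # < c1, r, c2 >

# Mono-level: O is a predefined resource supporting decision making.
O_static: Set[Triple] = {
    ("sourceIP_a", "scans", "port_1080"),
    ("port_1080", "hosts", "trojan_y"),
}

def regenerate_O(observations) -> Set[Triple]:
    """Tri-level sketch: O is rebuilt in real time from whatever is
    currently observed, and only then made subject to pragmatics."""
    return {(c1, r, c2) for (c1, r, c2) in observations}

# Usage: the dynamic set reflects the situation at the moment of decision.
O_dynamic = regenerate_O([("sourceIP_b", "scans", "port_1080")])
print(O_static | O_dynamic)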

 

The difference can be seen in the notion of static versus dynamic.  A static ontology would have the form of O, but its origins and consequences might not be present.  One represents the ontological model as a reference to a real world.  So the representational distinction between static and dynamic is necessary.

 

How situatedness is to be brought into the tri-level knowledge structure is still an open question; however, we have thought about the use of the theory of singular perturbation from dynamical systems, as one possibility. 

 

Again, consider a set of “q” coupled oscillators whose spin is described by the following (simple) differential equation.

 

dj(i)/dt = w + SUM_k ( c(i,k) G( j(k) ) ),   i = 1, . . . , q

 

(with j(i) the oscillation phase, w the intrinsic (constant) oscillation rate, c(i,k) the coupling and G any non-linear function), having various types of network connections (architectures) and initial conditions. The architecture would be expressed in the coupling, which may be positive or negative. The coupling may also be variable and reflect certain regular features of the circuit dynamics of metabolic reactions.

 

The global behaviors of the system of systems are observed to partially or fully synchronize the relative phase of individual oscillations.
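
As a minimal sketch only, and under the common assumption that G is sinusoidal and applied to phase differences (a Kuramoto-style choice that is not dictated by the text), the following simulation shows a small set of coupled oscillators reaching phase synchrony; all parameter values are illustrative.

import numpy as np

# Minimal sketch of q coupled phase oscillators with Euler integration,
# assuming G = sin applied to phase differences.
q = 8
rng = np.random.default_rng(0)
w = rng.normal(1.0, 0.05, size=q)       # intrinsic (nearly constant) rates
c = 0.5 * np.ones((q, q)) / q           # coupling; sign/architecture could vary
j = rng.uniform(0, 2 * np.pi, size=q)   # initial phases

dt = 0.01
for _ in range(20000):
    # Each oscillator is pulled toward the phases of those it is coupled to.
    dj = w + (c * np.sin(j[None, :] - j[:, None])).sum(axis=1)
    j = (j + dt * dj) % (2 * np.pi)

# An order parameter near 1.0 indicates the relative phases have synchronized.
order = abs(np.exp(1j * j).mean())
print(f"synchronization order parameter: {order:.3f}")

With this positive, uniform coupling the order parameter approaches 1; other coupling architectures need not synchronize fully or at all.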

 

We assume that the system develops systemic and regular behavior that acts as a partial control over the values that the coupling takes on. This control is from the higher level (of organization), of the two levels, to the lower level.  A middle “level” emerges as the system defines itself and separates itself from an environment.  Thus our use of the term “tri-level”.

 


 

 

Section 3:

Associative memories between theme space and semantic space

Editing in progress

 

 

 

(link to discussion on generative methodology)


Section 4: Definitions and Theorems on the decomposition of a SLIP data set

 

This is edited and extended from “Definitions and Theorems on the decomposition of a SLIP data set: Summary of Results, July – October 2001, SLIP Intrusion Detection Technology” (74 pages).

 

The SLIP technology does a Fourier-type decomposition of the invariance in the data set into clusters that have tight intra-cluster linkage and weak inter-cluster linkage.  This invariance is encoded as the nodes of a tree (see Figure 2).

 

 

Figure 2: A trace route event that is gathered into category D1

 

In Figure 2 we see the SLIP interface version 1.6 (October 17th, 2001).  The software is available from OntologyStream Inc.  By scrolling the top left window we would see that a RealSecure Intrusion Detection System (IDS) classified each of the 24 elements of category D1 as a Trace Route event.  The only computational means for gathering these elements together is the non-specific relationship between Source IP plus Source Port and Target IP plus Target Port. 

 

In the notational paper [3] we develop a notation for theme space analysis where each occurrence of type is given its own dimension.  The SLIP work has a similar encoding process.  We encode the data into a Hilbert space, as before; but we start with the rows and columns of the flat table. 

 

In SLIP we start the differential ontology by assuming that a flat data table exists, with columns defining “type” and, in each column, a specific (finite but open) set of “values” for each type.  One can take each cell of the flat table as being defined uniquely by a column and a row.  Spreadsheets often regard the column and row as defining a cell.  The cell can then be defined by the CCM (Contiguous Connection Model) as a ( type : value ) pair.

 

SLIP works between the two types to identify the simplest link analysis: a correlation between a value in one type and distinct values in the second type.  The process, in words, is something like this:

 

“Select two types. Select one type to be a relationship.  Look at each value in the first type and check to see if this value ‘occurs with’ a value in the second type.  If this happens more than once, then localize new information in the form

 

< ( type(2) : value(2)' ), ( type(1) : value(1) ), ( type(2) : value(2)'' ) >”

 

The type information is discarded because the information is localized around the correlation between values.  One may invert the type-value relationship and encode the values as numbers on a Hilbert line, with the old type information as a linked packet.  However, we are in the simplest case, and are considering only two types at a time.  The number of values, however, can be large (or small).

 

The ( type : value ) pairs can be drawn from a relational database, from text files, or from native CCM databases.  The notion of occurrence can be defined as appropriate in each case.  Occurrence is a measurement and instrumentation aspect of the AIPM.

 

In SLIP we encode new information from existing information with the transformation

 

( a1 , b ) +  ( a2 , b )  →   < a1 , r,  a2 >

 

where a(1) and a(2) are values (of type two) that share a single value (of type one).  The type one value is then deemed to be a relationship between the two type two values.
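
As a minimal sketch (the record values below are invented for illustration), the transformation can be read as: group the values that occur with a shared partner value, and emit one triple for every such pair.

from collections import defaultdict
from itertools import combinations

# Minimal sketch of the SLIP pairing transformation.  Each record is a pair
# (a, b).  Whenever two records share the same b, the transformation
# (a1, b) + (a2, b) -> <a1, r, a2> is applied, with b inducing the
# non-specific relationship r.
records = [
    ("10.0.0.1", "port_1080"),
    ("10.0.0.2", "port_1080"),
    ("10.0.0.3", "port_1080"),
    ("10.0.0.9", "port_25"),
]

by_shared_value = defaultdict(set)
for a, b in records:
    by_shared_value[b].add(a)

triples = []
for b, atoms in by_shared_value.items():
    # Only values occurring with more than one partner localize new information.
    for a1, a2 in combinations(sorted(atoms), 2):
        triples.append((a1, b, a2))      # < a1 , r , a2 >, with r labeled by b

print(triples)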

 


Section 5: Technical observation and theorems

 

On the question of non-crisp categories:  The clusters themselves are gathered from the separation of high level categories within the limiting distribution of the scatter-gather process.  The set of all of these clusters will have pairs of clusters that have non-empty intersections.  What lies outside of a core might be considered to be an environment and may have significance.  However, we focus first on the cores of these categories.

 

Figure 3: Characterization of a core

 

Our method for automatically generating a framework to start analysis was chosen to eliminate the environment.   This process is an essential part of the process of categorical abstraction.  We were looking for what stays the same as one moves from one event type to another.  The category is about that sameness.  The core is the center of a category where this center is invariant across several limiting distributions.  The purpose of the SLIP Framework is to make available to domain experts the invariance that is produced by the non-specific relationship. 

 

The set of ending nodes (sometimes called the leaves) of the SLIP Tree produces a crisp partition of the original set of atoms.  This is because children are always produced in such a way as to provide this crisp partition, with no overlap between the memberships of the tree leaves.  The memberships of the tree leaves will also union to produce the complete membership of the parent node in the SLIP Framework tree.   This is the important notion of disjoint union.

 

A series of theorems can be given.  However, the observation now is that a non-crisp partition can be developed in order to reflect what we conjecture would be category entanglement (again look at the separation of context and core seen in figure 2). 

 

In more advanced implementations of the fundamental SLIP theory, one can use feature analysis and a voting procedure to route new events into a type of non-crisp classification based on profiles of categories.  This introduces the notion of a tri-level architecture for information routing, categorization and information retrieval.  These more advanced implementations await the completion of the first fully stand-alone, full-function SLIP interface.  The reason why we mention the advanced implementations is that when the SLIP Frameworks are being created and shared [4], then a number of intelligent programs will be possible.

 

 

Figure 5: The SLIP Interface taking comments on an event category

 

The SLIP methods produce context free core category memberships that appear in different environments.  The method is a direct intersection of sets rather than a statistical method.  

 

Metadata can be associated with these nodes.  This was done, in the 2002 prototype, using a scripting language from the command line (see Figure 5).  Comments made by analysts are appended to a file pointed to by a metadata tag within the node tag in XML.

 

On the fundamental SLIP theory:  A set of definitions establishes an abstract mathematical language.  This language is useful for two basic reasons:

 

1)     The language allows one to conjecture about and prove properties that one can find experimentally by developing algorithms and software

2)     The language allows a peer review of the underlying intuitions about how the SLIP technology might be used.

 


Section 6: Primary concepts

 

Datamart:  The SLIP datamart consists of a table with two columns.  In the non-database version of the SLIP technology, a CCM repository is used instead of a relational table.  In the full text version an ASCII text file is the datamart.

 

The selection of the datamart is important.  The following issues might have an impact on what type of data is selected.

 

1)     A time period may delimit an event or group of events.

2)     A domain expert may analyze the nature of the data itself and produce an Analytic Conjecture that relates two types. 

 

In the relational model the types are columns.  For example, the first column values might be the defensive addresses and the second column values might be the system calls that occurred during the time interval in question.  In a CCM repository, a “type” is represented by the set of all ( type : value ) pairs where the first part of the pair has the same ASCII string.  Category theory can make an equivalence relation between the elements of a thesaurus ring.  So in this case, the CCM type is defined by an equivalence relation.  Similar variation in type and value representation can be instrumented for full text parsing.
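
As a minimal sketch, and with a thesaurus ring invented purely for illustration, cells of a flat table can be read off as ( type : value ) pairs, with an equivalence map collapsing variant column names into one CCM type:

# Minimal sketch: cells of a flat table as ( type : value ) pairs, with a
# hypothetical thesaurus ring used as an equivalence map over type strings.
columns = ["defensive address", "system call"]
rows = [
    ["192.168.1.5", "open"],
    ["192.168.1.7", "exec"],
]

# Thesaurus ring: variant spellings treated as the same CCM type.
thesaurus = {
    "defensive address": "defender_ip",
    "defender address": "defender_ip",
    "system call": "syscall",
}

pairs = []
for row in rows:
    for col_name, cell in zip(columns, row):
        ccm_type = thesaurus.get(col_name, col_name)   # equivalence class representative
        pairs.append((ccm_type, cell))                 # the ( type : value ) pair

print(pairs)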

 

It is important to realize that much of the value of the SLIP technology will come about when domain experts develop data with specific investigations in mind. 

 

Non-specific relations: Pairs of first column values are identified by parsing the Datamart and finding occurrences where a second column value (b) appears in more than one record.

 

The critical issues are only that each line in the Pairs text file represents a record and that the record represents a single event.

 

 

Figure 5:  The non-specific relationship between the atoms a1 and a2

 

Two values from the first column are paired if the associated value in the second column is the same.  This situation is represented in Figure 5. 

 

Once this pairing is done, then the pairs are used to specify atoms.  The atoms are those elements that are in one or more pairs. 

 

Formally we have:

 

( a1 , b ) +  ( a2 , b )  →   < a1 , r,  a2 >

 

where r is the non-specific relationship.

 

Pairs.dbf:  Pairs.dbf (formerly called Two.dbf) is the data source that contains the pairs of secondnames.  The table is denoted with a script bold Q.

 

Set of Atoms:  Q is parsed to find all occurrences of second column values.  These are added into Atoms.dbf.  This set is determined by the occurrences of the non-specific relationship, and oftentimes the set of atoms is a small percentage of all second column values that are in the mart.  The percentage is sometimes 1/1000, for example.  Thus a much larger data set is reduced in size at the very beginning of the SLIP computations.  The set of atoms is denoted with a script bold A. This reduction of data into “informational units” is not statistical.  Data aggregation is a filtering process that uses the non-specific relationship to pull together things that are related, but related only by this one well-defined non-specific relationship.

 

All of this computational process is formal.  No meaning is assigned until the domain expert looks at the membership, uses the membership to produce a report from the original data source, and makes human judgments about the nature of the core category.  These judgments are then collected as the domain expert types into the comment property of the core category.

 

The notion of nearness and the topological notions of analytic mathematics are used to produce a retrieval of elements into a core category.  The sets of elements produced in this way will actually be related in exactly the fashion defined by the chaining that occurs through the non-specific relationship.

 

Ratio atoms/secondname:  This ratio is a computed value for any mart table.  This ratio can be computed and used to tune the import of SLIP data marts.  For example, consider the dataset that produces the sample SLIP Framework (see Figure 2).  The mart columns were originally selected to reflect scanning for Trojans and the follow-up use of a port identified in Trojan scans.  A quick computation from RealSecure database reports can indicate if there is significant Trojan scanning in this dataset.  If there is, and one wishes to compute the full SLIP Framework, then this is possible.

 

This ratio and others like it can be used to produce a “selective attention” that automates some of the intelligence functions of ID, particularly those involving either data visualization or link analysis. 

 

Distribution of A on the circle:  The first use of the circle for scatter-gather, that I know of, was my own in 1996.  The technique has not been publicly reported as yet.  Scatter-gather generally requires both a pushing apart of atoms and a pulling together.  However, because the circle (the one-point compactification of the line interval) is a manifold with no boundary, a pulling together is sufficient to separate those groups that are interlinked (called prime cores).  The use of the collection Q (derived from link analysis) to gather was invented (by Prueitt) in 2001, as far as I know.

 

Scatter/gather:  The scatter is done into a manifold where a distance between atoms is well defined. This is done randomly, to eliminate any meaning of relationships that might be implied by the initial distance between atoms.  The scatter does in fact uniquely define the distance between any two atoms from A.  Moreover we have a common manifold metric on the entire set of atoms.  This collective metric is to be contrasted with the pairwise metric given in Q.  The pairwise metric is used (again in a stochastic fashion) to induce an organization of the topology of the manifold.  This induced organization is inherited from the pairwise metrics.

 

Halting condition:  The halting condition occurs when the gather process, if continued, will not move any of the atoms.  Three formal objects are required to define the halting condition.

 

a)     The paired atoms that are in the Pair table, Q.

b)    A subset of A identified from Q

c)     A distribution, D, of these atoms in a manifold

 

At any one time, the atom distribution in the manifold will define a distance between every pair of atoms.  The Pair table is used to randomly find pairs of points that have the non-specific relationship.  When such a pair is found, the two points associated with the two atoms are moved closer together by a little bit.  (This is like simulated annealing in artificial neural network systems.)  If the two points are already at the same location, then nothing will happen to the distribution.

 

The halting condition is a simple notion in the case of a well-linked set of atoms, such as the defined “prime cores” in our example.  However, the notion is not exactly simple in other cases.  For example, the halting condition can be characterized with a “chain condition”, and this recharacterization can be used to stop the gather process before the halting condition is reached.

 

A set of atoms placed into a manifold does NOT have the halting condition with respect to Q if and only if there exists a pair in the pair table where the first element and the second element are NOT at the same location in the manifold.  As the gather process proceeds, the set may, or may not, reach the halting condition at a later iteration.
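
A minimal sketch of the scatter/gather process, under illustrative assumptions (a handful of invented atoms, a fixed step size, and a small tolerance standing in for "the same location"), with the halting condition tested exactly as stated above:

import random
import math

# Minimal sketch of scatter/gather on the circle; all values are illustrative.
atoms = ["a1", "a2", "a3", "a4", "a5"]
Q = [("a1", "a2"), ("a2", "a3"), ("a4", "a5")]   # pairs with the non-specific relationship

# Scatter: random angles, so no initial distance carries any meaning.
random.seed(0)
pos = {a: random.uniform(0.0, 2 * math.pi) for a in atoms}

def circle_gap(x, y):
    """Shortest angular distance between two points on the circle."""
    d = abs(x - y) % (2 * math.pi)
    return min(d, 2 * math.pi - d)

def has_halting_condition(pos, Q, tol=1e-6):
    """Halting iff no pair in Q is still separated in the current distribution."""
    return all(circle_gap(pos[a], pos[b]) < tol for a, b in Q)

# Gather: repeatedly pick a pair from Q and move its two points a little closer.
step = 0.1
for _ in range(100000):
    if has_halting_condition(pos, Q):
        break
    a, b = random.choice(Q)
    shift = step * (pos[b] - pos[a])
    pos[a] += shift
    pos[b] -= shift

print({a: round(v, 3) for a, v in pos.items()})

In this toy run the chained atoms a1, a2, a3 gather to one location and a4, a5 to another, which is the kind of separation into interlinked groups the text describes.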

 


Section 7: The SLIP Theorems

 

One needs three objects: A, Q, and the distribution of A in the manifold.  The distribution changes due to iterations of the gather, and thus an index will uniquely specify the distribution at iteration i.  We can denote the ith distribution as D(i). So the triple (A, Q, D(i) ) has the halting condition if and only if

 

(A, Q, D(i) ) = (A, Q, D(j) )  for all j > i

 

Definition:  A subset S of size greater than 1 from (A, Q, D(*))  is said to be prime if and only if there exists an index k such that (A, Q, D(k)) has the halting condition AND all of the elements of S are gathered to the same location.

 

As Theorem 1 shows, the union of primes will be prime only in a special case.

 

Figure 5: See Theorem 1

 

Theorem 1:  Let C(1), C(2), and C(3) be three distinct clusters.  Suppose that each, when treated by itself, is prime.  The union of the three will be prime if and only if there exists a chain, of length 2,

 

{ <a’, r, a(1)>, <a(1), r, a”> }

 

where a’ is in C(1), a(1) is in C(2), and a” is in C(3).
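
A minimal sketch of the chain condition in Theorem 1, with cluster memberships and pair-table edges invented for illustration; Q is treated as an undirected graph, and the test asks whether some element of C(1) reaches an element of C(3) through a single atom of C(2):

from itertools import product

# Hypothetical prime clusters and a pair table Q expressed as undirected edges.
C1, C2, C3 = {"a", "b"}, {"m"}, {"x", "y"}
Q_edges = {frozenset(e) for e in [("a", "m"), ("m", "x"), ("b", "y")]}

def union_is_prime(C1, C2, C3, edges):
    """Theorem 1 sketch: the union is prime iff some chain of length 2
    { <a', r, a1>, <a1, r, a''> } links C1 to C3 through C2."""
    for a_prime, a1, a_dblprime in product(C1, C2, C3):
        if frozenset((a_prime, a1)) in edges and frozenset((a1, a_dblprime)) in edges:
            return True
    return False

print(union_is_prime(C1, C2, C3, Q_edges))   # True for this invented example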

 

Corollary 2.1: Theorem 2 shows one can create a halting condition by removing elements. 

 

Theorem 2: Let S be the subset of (A, Q, D(i)) whose members are part of pairs (a’, a”) in Q where a’ and a” are not in the same location, and where the one part of the pair has no other chaining relationship to any atom having a chaining relationship to the other part of the pair.  Then (A - S, Q, D(k)) will have the halting condition for all k > j, for some j.

 

Figure 6: See Theorem 2 and Definition 2

 

In Figure 6 we illustrate the possibility that a set might be identified that, when removed, separates the set of atoms into three primes.

 

Definition 2:  A prime set may be fractured into multiple prime sets given that one finds a set such as the subset S in Theorem 2.

 

Clearly the subset S will be the cause of future changes in the distribution.  (A - S, Q, D(i)) may be prime or composite, but this condition can be seen immediately in the distribution D(k) at some later iteration k, an iteration that can be found eventually.  If prime, since we are at the halting condition, all of the atoms will be in exactly the same location.  If composite, there will be primes (subsets of the distribution) that will each have the halting condition.

 

Lemma 2.1:  Let S be a subset that fractures A. Let a’ be an element of S, a” be an element of S, and < a’, r, a”> be an element of Q.  If there exists <a*, r, a”> in Q, then there does not exist <a’, r, a*> as an element of Q.

 

With these theorems and definitions we may obtain some insight into the SLIP algorithms. 

 

Let us review the SLIP process briefly.  We start with a set of atoms, A, and a set, Q, of pairs of these atoms that have the non-specific relationship defined by a link analysis on the mart table.  In our example the size of A is 725.  The size of Q is 18,763.  The size of the mart is 56,000 records.

 

The ratio of Q to the set of all possible pairs from A is 18,763/525,625, or about 3.6 percent.  This is the probability that a randomly selected pair will have the non-specific relationship.  We use the pairs that are in Q to gather together atoms in the manifold.  At some point we stop this process and find a candidate for the set S in Theorem 2.

 

Finding exactly the situation in Theorem 2 still seems a bit difficult.  Computing chains and defining categories this way seems computationally intensive, but so is the gather process.  However, the heuristic of taking intersections across several separately computed limiting distributions seems to result in a practical solution that follows the theory exactly.  The computed cores are prime, but we get only a few at a time (at each level).  Furthermore, it is still not clear that the prime decomposition using stratified theory will produce THE unique result.

 

The open question is whether or not we have a unique factorization theorem as one does in number theory. 

 

Conjecture 1 (October 17th, 2001):  There exists a unique decomposition of a set of atoms into primes using a purely algorithmic process.  This unique decomposition will be produced each time the SLIP algorithms are run, given that the algorithms find any splitting subset S that exists in any prime core.

 

The SLIP algorithms and data structure form a new type of information system.  This system is a non-traditional database, similar to various non-relational structures (i.e., not third normal form relational databases) called Referential Information Bases (RIBs).  These RIB structures are being developed as static in-memory structures by a number of groups.  Query and data management features are different in that the RIBs do NOT allow delete or append functions.  These update functions are accomplished by completely unloading and remapping data into a formal finite state machine.  This process is slow compared to relational database updates.

 

However, once the remapping occurs the update is complete, and very fast data aggregation and emergent computing processes can occur.  The RIB technologies are being developed as a new generation of data warehousing technologies, where append and delete are managed in a batch process.  SLIP uses many of the concepts that have been developed by others involved in the RIB-type technologies.

 

The following is some preliminary analysis about how the input data might be configured so as to make a specific search for events of a particular type. 

 

 

Figure 7: A chain relationship between a(1) and a(n)

 

The events in Figure 7 have a specific nature if the first and second columns are (1) attacker locations { x, a(1), a(2), a(3), a(4), . . . , a(n) } and (2) defender location, { y }.

 

We have seen two different interpretations of a chain relationship of this type given these two columns in the data mart.  The first interpretation is about the identification of a Trojan and the consequent use of this knowledge by the attacker.  The second interpretation is about session hijacking. 

 

So what is the difference between these two types of events?

 

The first interpretation was the motivation for the example SLIP Framework that the demonstration (version 1.4) displays and navigates.  A port scan, from a source a(i), is used to trigger a response from a Trojan existing at location y.   Given a successful response from the scan, it is assumed that a(i) communicates off line with x and that x then addresses the same port at location y and establishes a session that in some way uses the Trojan.

 

A different source IP, a(j), is used against a different target IP, again denoted in Figure 7 as y.  The identity of y is not used in the SLIP scatter-gather, and thus the fact that y may change location is not accounted for.  We create a category Y of y locations and treat any member of the category without distinction.

 

y(i), y(j) elements of Y  →  y(i) ~ y(j)

 

If in each case, a single IP is used once a Trojan is identified, then we have established the chain relationship.  The source IP locations { x, a(1), a(2), a(3), a(4), . . . , a(n) } will form all or part of a prime core. 

 

The second interpretation is quite a bit different, and yet has almost the same characteristics when put into the SLIP Framework.  One form of session hijacking occurs when a SYN is sent from x to y but x is spoofing its location, using the location a(i).

 

Some considerable work needs to be done in order to catch a reply from one of the spoofed locations. 

 

Route traces will sometimes work, if the administrator has not turned off route tracing.  Then, given the ability to catch the reply, the source must manage to guess the session id number. Here is where a vulnerability of the attacker occurs.  The ability to guess the session id is completely dependent on there being very little time elapsed between the SYN and the ACK reply.  A SLIP signature for this type of element could be the membership in E4 of the first example SLIP Framework.

 

 

Figure 8: Five spoofed addresses chained to a port 1080 attack

 

Motivation:  The motivation for creating a batch of test sets and examining the cluster cores is to classify types of attacks from the patterns given in chain relationships within the pair table.  These chain relationships are due to shallow link analysis.  The clustering merely shows us where to look.  The halting condition for clustering has been shown to be equivalent to a formal property about chain relations.  So one has a computational foothold on the chains themselves.  The chaining in turn reveals real, but non-specific, linkage between data elements that exists in the data due to specific causes.

 


Section 8: On response degeneracy

 

The notion of degeneracy is used by Nobel Laureate Gerald Edelman to indicate a one-to-many and many-to-one relationship that captures the flexibility that can be seen experimentally in the subcellular protein circuits that support the acquisition of specific real time connection patterns in brain regions in response to experience (Edelman, 1987). The word "degeneracy", when used in this way, points to stochastic theories of causation in which a probability distribution spreads potential along a finite and discrete set of paths, from one location, the present, into the future. This spread of paths from single nodes is realized in standard Bayesian analysis using graphs and trees. Each node is characterized by a path leading from the past, or a representation of the past, to the node (the present). From this node we have the n paths leading away into the future.

 

A representation of the concept of “response degeneracy”

 

Associated with each path is a conditional probability. So, in the abstract, the future is determined by a random variable that expresses these conditional probabilities. The state transition is degenerate in the same way that Edelman networks are, on the surface. But the mechanisms that express degeneracy in biology are quite different from those of the Bayesian inference engine.

 

The paths from each node, in a Bayesian network, lead to yet other nodes; and thus the number of nodes soon explodes. This requires a specific methodology to address this unrealistic character of the network. We can treat the nodes as states followed by gestures. The state is what is observed by an actor, something that can express a gesture in response to any state. The gestures and states are a composition of different substrates into a structure that is known via category policies.
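
As a minimal sketch (states, gestures and probabilities are all invented for illustration), a node can be treated as an observed state whose outgoing gestures carry conditional probabilities, so that the "future" is drawn from a random variable over several paths:

import random

# Hypothetical degenerate transition: from one observed state, several
# gestures are possible, each weighted by a conditional probability.
transitions = {
    "port_scan_observed": [("ignore", 0.2), ("log_only", 0.5), ("block_source", 0.3)],
}

def next_gesture(state, rng=random):
    gestures, weights = zip(*transitions[state])
    # The future is determined by a random variable over the outgoing paths.
    return rng.choices(gestures, weights=weights, k=1)[0]

print(next_gesture("port_scan_observed"))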

 

We also see that an absence of flexibility characterizes methods that rely on theme representations of text for semantic content. Theme representations are generally products of linguistic analysis and/or knowledge engineering. Often what is being sought are the facts of the case: who said what and when. This surface is a one-level representation of a complex system. The picture given is, however, not the complete picture of human inner thought or behavior. We immediately understand that word phrases, by themselves, cannot capture the degenerate set of all possible interpretations of the meaning of the author and the understanding of various types of readers. Variable response potential is necessary and is circumscribed by subjective constraints. The question that the chapters of this book put to the reader regards the conjecture that subjective constraints are possible in machine intelligence, given that the intelligence is expressed in the tri-level architecture.

 

We make the claim that variable response is not possible, except through the development of a memory structure, where the invariant substructures are encoded and made available for real time remembering within the context established by variable category policies.

 

As will be discussed further in Chapter 10, the special Quasi Axiomatic Theories (QAT) were developed, in Russia, by Finn and his colleagues (Finn, 1991) in order to address the need for open logics. C. S. Peirce's foundational logics helped the Russians establish not only an open logic but also a system that is stratified. The QAT languages manage an assignment of meaningfulness during the aggregation of substructure, in a step-by-step fashion that allows the strict separation between logic atoms and evaluation functions. The assignment of truth-value is made under the rules of algorithms that are specified in degenerate situational logics. Moreover, the rules of deduction and the rules for assignment of meaning are to be modified according to the open systems theory developed, also in Russia, by Pospelov (1986). Thus specific interpretations, about a specific input, can be modeled as a simple constraint on a larger class of interpretations.

 

Response degeneracy is also constrained by the state of the environment, whether the metabolic environment or some interpretive environment such as mental events. In metabolic environments, a certain type of circuit dynamic exists where one state leads to another and environmental populations of reactants support each state. But in a degenerate case, the dynamic is incomplete without additional constraint. As Edelman's work illustrates, these circuits exist as protein conformational state changes, and in metabolic reactions, in the immune and neural systems. The expression is governed ultimately by an image of self that is the complex expression of the whole system. In interpretive environments, we need a similar notion. This notion is the notion of a "system" image.

 

In order for a text understanding system to have a feature that is similar to response degeneracy, we need situational logic. We also need syntagmatic representations of the form <a, r, b> where a and b are locations in a semantic net and r is a class of relationships that can reasonably exist between two concepts.

 

Proper methods for evolutionary linking of elementary syntagmatic units, into situational models, can be chosen after we handle the difficult software issues regarding theme representation and visualization.

 

Visualization tools for semantic spaces have a natural similarity to tools for visualizing chemical graphs. Scientific visualization has concentrated for two decades on visualization of knowledge about chemical graphs, and thus the technology and the methodology is widely used and understood. The cognitive graphs are more complex in certain ways, perhaps by an order of magnitude, but in other ways the chemical graph and the cognitive graph are exactly the same.

 

We expect that, in the near future, specific semantic net structures can be delivered to the human client via visualization tools designed for chemical graphs. The delivery to the user can be in the form of a text composed by automated means.

 



[1] See: http://www.bcngroup.org/area1/2005beads/GIF/RoadMap.htm

[2] See the SLIP technology index at: http://www.ontologystream.com/cA/index.htm

[3] See the notational paper at: http://www.bcngroup.org/area2/KSF/Notation/notation.htm

[4] See the BCNGroup roadmap for adopting semantic technology at: http://www.bcngroup.org/area1/2005beads/GIF/RoadMap.htm