Research Note 24
October 11, 2003
Supplemental
Report to the REAL Program at DARPA
The SAIC –
Ontologystream Proposal (September 3rd, 2003)
Design features of Class:Object pairing technology
Based on work completed over the past several months, we have concluded that
certain adjustments need to be made to our program. We have come to understand
that the Continuous Connection Model (CCM, ATS's patented technology), while
quite valuable, is subtly misaligned with our design and needs to be replaced.
A detailed explanation of our critique and alternative solution is currently
under review by several leading scholars. Here, we summarize our new
understanding and indicate some of the implications. With this adjustment, we
believe that deployable routines can soon be made available that serve
important intelligence tasks, such as detection of real-time memetic patterns
indicating intent to conduct terrorism. (Note on civilian uses for ORBs.)
A tacit assumption underlying many text-processing systems is that structure
in textual data is sufficient for creating high-fidelity conceptual
representation, and that the contribution of the human analyst is not
essential. Our assumption has always been different: that ontology services
need to supplement human reification, and that real linguistic variation can
best be found by allowing humans to interact with rules applied in multi-pass
parsing. Two input processes are needed, one human and one algorithmic. The
difficulty comes in supporting this approach at every level.
(Note on the use of the Actionable Intelligence Process Model (AIPM).)
Our system will now allow humans to fortify conceptual representations at the
base level. Encoding similar to the type:value pairing in the ATS technology
will now be replaced by a class:object pairing methodology. This methodology
is interoperable with the Web Ontology Language (OWL). In the listing of
design features below, we offer several new features that depend on the
revised approach and that are not features of the CCM technology. We are
confident that these features can be delivered in our REAL system and will
support breakthrough applications.
This shift in our approach yields advances elsewhere in our program. Our
original proposal called for the integration of ClearForest tools in Phase 2,
but we will now be able to bring this important integration forward into
Phase 1 without a change in cost.
Phase 1 will continue to depend on a unique Text Analysis International
software system for producing systems that have situational deep case grammar
and ontology-based multi-pass parsing capabilities. We will also continue to
task Steven Newcomb to develop a Topic Map control module for a conceptual
roll-up using word-level n-grams and frames with slots and fillers (as
discussed by Roger Schank), but the module will now employ class:object
encoding and encoded data transformation methods.
We also plan to insert
highly agile rule engines as part of text pre-processing, algorithmic methods
based on latent semantic indexing for categorizing text into context, and a
distributed knowledge management system based on SchemaLogic’s data schema and
terminology reconciliation technology.
1: Localization of
information.
1.1: Input is acquired from the user or from a machine algorithm:
1.1.1: Provide for human-community-based reconciliation of control parameters
using distributed controlled vocabularies.
1.1.2: Switch from
type:value to class:object terminology.
1.1.3: Use general framework theory, similar to and modeled after the Zachman
and Ballard frameworks.
1.2: The type:value pairing model that ATS developed is to be replaced by a publicly disclosed, unpatented class:object pairing technology:
1.2.1: Ontology classes and ontology objects will be complemented with more complex metadata.
1.2.2: A pointer, or hash key, will be placed into the class constructor so
that objects created will be accessible via a hash table management system
derived from Berkeley DB. This provides very fast access to the classes and
objects within the Ontology Reference Base.
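
The hash-key arrangement in 1.2.2 can be sketched roughly as follows. This is
a minimal illustration only, not the project's implementation: the names
OntologyReferenceBase, OntologyObject and make_key are hypothetical, a plain
Python dict stands in for the Berkeley DB derived hash table, and SHA-1 over
the "class:object" string is an assumed key scheme.

```python
import hashlib


def make_key(class_name, object_name):
    """Derive a stable hash key from a class:object pair (assumed scheme)."""
    pair = f"{class_name}:{object_name}".encode("utf-8")
    return hashlib.sha1(pair).hexdigest()


class OntologyReferenceBase:
    """In-memory stand-in for the hash table store (hypothetical)."""

    def __init__(self):
        self._table = {}  # hash key -> OntologyObject

    def put(self, obj):
        self._table[obj.key] = obj

    def get(self, class_name, object_name):
        # Lookup recomputes the key, so no scan over objects is needed.
        return self._table.get(make_key(class_name, object_name))


class OntologyObject:
    """A class:object pair carrying optional metadata (see 1.2.1)."""

    def __init__(self, class_name, object_name, orb, metadata=None):
        self.class_name = class_name
        self.object_name = object_name
        self.metadata = dict(metadata or {})
        # The key is computed in the constructor, so every object is
        # registered in the store at the moment it is created.
        self.key = make_key(class_name, object_name)
        orb.put(self)


orb = OntologyReferenceBase()
report = OntologyObject("Document", "report-17", orb, {"source": "analyst"})
found = orb.get("Document", "report-17")
missing = orb.get("Document", "no-such-object")
```

Because the key depends only on the pair itself, any process that knows the
class and object names can compute the key and fetch the object directly.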
1.3: A theory of
class:object may be developed to assist in the principled specification of type
and value during input and as part of a
control module:
1.3.1: Provide interoperability between ontology object encoded information
and Web Ontology Language (OWL) constructions, including those depending on
first-order predicate logic.
1.3.2: Provide interoperability between ontology object encoded information
and Topic Map constructions, including HyTime and Grove constructions.
1.3.3: Provide for the inclusion of type and value
role specifications so as to enhance the model of linguistic variation as
applied in specific circumstances – such as modeling the social discourse
of terrorist cells.
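
One possible reading of the OWL interoperability in 1.3.1 is that each
class:object pair can be exported as RDF triples: one declaring the class and
one typing the object as a member of it. The sketch below assumes this
mapping; the BASE namespace and the function names are hypothetical, and
class and object names are assumed to be safe to use inside an IRI. The two
vocabulary IRIs are the standard rdf:type and owl:Class identifiers.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
OWL_CLASS = "http://www.w3.org/2002/07/owl#Class"
BASE = "http://example.org/orb#"  # hypothetical namespace for illustration


def pair_to_triples(class_name, object_name):
    """Map one class:object pair to two (subject, predicate, object) triples."""
    class_iri = BASE + class_name
    object_iri = BASE + object_name
    return [
        (class_iri, RDF_TYPE, OWL_CLASS),   # the class is an OWL class
        (object_iri, RDF_TYPE, class_iri),  # the object is a member of it
    ]


def to_ntriples(triples):
    """Serialize IRI-only triples in N-Triples line format."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


triples = pair_to_triples("Document", "report-17")
text = to_ntriples(triples)
```

Type and value role specifications (1.3.3) would need additional properties
on top of this; the sketch covers only the bare class membership.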
2: Global organization of
ontology object constructions.
2.1: Provide a collective view of all of the connections between objects,
where these connections are determined by a first-order predicate logic,
using those objects that are defined within the elementary atoms of a
situational logic.
2.2: Provide a
means to reify, by human inspection,
collective views of a collection of objects and to make changes in connections
manually.
2.3: Provide a
means to produce connections that would not otherwise exist, by indirect means
including latent
semantic indexing, ontology services, and continuum mathematical models of
linguistic variation.
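
The three requirements of section 2 can be illustrated together in a small
sketch. The assumptions are considerable: objects are plain dicts, the
"logic" in 2.1 is reduced to a single two-place predicate, and the indirect
method in 2.3 is plain term overlap (Jaccard similarity), used here as a
simple stand-in for latent semantic indexing rather than an implementation
of it. All names and sample data are hypothetical.

```python
from itertools import combinations


def collective_view(objects, related):
    """2.1: every connection (a, b) for which the predicate related holds."""
    return {frozenset((a, b))
            for a, b in combinations(sorted(objects), 2)
            if related(objects[a], objects[b])}


def jaccard(x, y):
    """Share of terms two term sets have in common."""
    union = x | y
    return len(x & y) / len(union) if union else 0.0


def indirect_connections(objects, threshold):
    """2.3 stand-in: connect objects whose term sets overlap enough."""
    return {frozenset((a, b))
            for a, b in combinations(sorted(objects), 2)
            if jaccard(objects[a]["terms"], objects[b]["terms"]) >= threshold}


objects = {
    "a": {"class": "Person", "terms": {"funding", "courier"}},
    "b": {"class": "Person", "terms": {"border"}},
    "c": {"class": "Account", "terms": {"funding", "transfer"}},
    "d": {"class": "Document", "terms": {"weather"}},
}

# 2.1: connections determined by a predicate (here: same class).
view = collective_view(objects, lambda x, y: x["class"] == y["class"])
direct = set(view)

# 2.2: a human analyst inspects the view and edits it by hand.
view.discard(frozenset(("a", "b")))
view.add(frozenset(("b", "d")))

# 2.3: connections that the predicate alone would not produce.
indirect = indirect_connections(objects, threshold=0.3)
view |= indirect
```

Keeping the manual edits and the indirect connections as separate steps makes
it possible to record which connections came from the logic, which from the
analyst, and which from the indirect method.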
3: Use of graph
constructions and transforms on graphs.
3.1: Develop and
encode into computer processes extended
methods for specifying relationships of various types using a theory of
type.
3.2: Link objects having the nature of class:object pairs, using convolution
operators that have as the operator's "domain" a set of graphs, and as the
operator's "range" a set of graphs.
3.3: Link objects having the nature of class:object pairs, using convolution
operators that have as the operator's domain a set of objects, and as the
operator's range a set of graphs.
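
The note does not define the convolution operators themselves, so the sketch
below illustrates only the two signatures in 3.2 and 3.3: one operator taking
a set of objects to a set of graphs, and one taking a set of graphs to a set
of graphs. The particular operations chosen (a star graph per class, and
merging graphs that share a node) are assumptions made for illustration, as
are all names. A graph is represented as a frozenset of two-node edges.

```python
def nodes(graph):
    """All nodes that appear in any edge of the graph."""
    return {n for edge in graph for n in edge}


def objects_to_graphs(pairs):
    """3.3 signature: set of (class, object) pairs -> set of graphs.

    Builds one star graph per class, linking the class to its objects.
    """
    by_class = {}
    for cls, obj in pairs:
        by_class.setdefault(cls, []).append(obj)
    return {frozenset(frozenset((cls, o)) for o in objs)
            for cls, objs in by_class.items()}


def merge_overlapping(graphs):
    """3.2 signature: set of graphs -> set of graphs.

    Graphs that share at least one node are merged into a single graph.
    """
    merged = []  # invariant: graphs in this list share no nodes
    for g in graphs:
        g = set(g)
        rest = []
        for m in merged:
            if nodes(m) & nodes(g):
                g |= m  # absorb the overlapping graph
            else:
                rest.append(m)
        merged = rest + [g]
    return {frozenset(g) for g in merged}


pairs = {("Person", "ann"), ("Person", "bob"), ("Account", "acct-1")}
graphs = objects_to_graphs(pairs)

# A further graph linking an object of one class to an object of another.
link = frozenset({frozenset(("ann", "acct-1"))})
merged = merge_overlapping(graphs | {link})
```

Because both operators return sets of graphs, the output of either one can be
passed to the graph-to-graph operator again, which allows operators of this
kind to be composed.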