Categorization in Cognitive Computer Science

John F. Sowa
VivoMind LLC

UQÀM Summer Institute on Cognitive Science
30 June 2003

Categorization

Classification and categorization are fundamental to intelligence — in every species.

Similarity: Recognition that two stimuli are signs of the same category.

Identity: Recognition that two stimuli are signs of the same categories for all relevant purposes.

Generalization/specialization: Recognition that some category includes another.

Negation: Denial that some stimulus is a sign of some category.
Note: these four operations, when combined in all possible ways, are sufficient to define first-order logic.

Categorization and Reasoning

Deduction: Applying a general principle to a special case.

Induction: Deriving a general principle from special cases.

Abduction: Guessing that some general principle can relate a given pattern of cases.

Analogy: Finding a common pattern in different cases.

Peirce's Logic of Pragmatism

Sensory-Reasoning-Motor Cycle

1. Induction: From observations to generalizations to the "knowledge soup."

2. Abduction: Extract hypotheses from the soup to form a tentative theory.

3. Revision: More abductions to revise the theory.

4. Deduction: Use the theory to make predictions.

5. Action: Test predictions by changing the world.

6. Repeat from line #1.

Replacing Sherlock Holmes

A Big Categorization Project

Cyc project started in 1984 by Doug Lenat.

Name comes from the stressed syllable of encyclopedia.

Goal: implement the commonsense knowledge of an average human being.

After $65 million and 650 person-years of work,
600,000 categories
defined by 2,000,000 axioms
organized in 6,000 microtheories.

But it cannot compete with a 10-year-old child.

Cyc Review

Two-day DARPA-sponsored review of Cyc in June 2003 with about two dozen AI experts.
Consensus:

Cyc is a unique and valuable resource:
A great deal has been learned from it.
Much more can be learned from it.
If it were canceled, something like it would have to be done again.

Support for Cyc should be continued.

Cyc should be freely available for research purposes.

But there are many questions about the relationship of Cyc to other R & D efforts.

Lexical Resources

Developers of WordNet (George Miller) and FrameNet (Chuck Fillmore) were also present.
Consensus:

Lexical resources are complementary to Cyc.

Extremely valuable for natural language projects.

Desirable to integrate contributions from various sources.

Integration would require relatively modest funding.

Word senses (synsets) can be linked to the categories of Cyc and other axiomatized ontologies.

Feigenbaum's Question

Ed Feigenbaum asked why Cyc has taken so long to become "intelligent".

In 1961, I. J. Good made a prediction:
"It is more probable than not that, within the twentieth century, an ultraintelligent machine will be built and that it will be the last invention that man need make."

Why hasn't Good's prediction come to pass?

Is there some missing ingredient that the AI community hasn't discovered?

What is it? Could it be added to Cyc?

Cyc's Piece of the Pie

Cyc does not replace Sherlock Holmes.

It requires people like him to write axioms.

At a cost of $10,000 to encode one page from a textbook.

Ibn Taymiyya Contra Aristotle

Fourteenth-century Muslim legal scholar.

Admitted that deduction is necessary for pure mathematics.

But for reasoning about the world, deduction is limited by the accuracy of the induction.

Given the same data, analogy can replace induction + deduction.

Ibn Taymiyya's Argument

A theory can be useful, if available.

But analogy can be used when no theory exists.

VivoMind Analogy Engine

Three methods of analogy:

1. Matching labels: Compare type labels on conceptual graphs.

2. Matching subgraphs: Compare subgraphs independent of labels.

3. Matching transformations: Transform subgraphs.

Methods #1 and #2 take time proportional to N log N.
Method #3 takes polynomial time (analogies of analogies).

Analogy of Cat to Car

Cat         Car
head        hood
eye         headlight
cornea      glass plate
mouth       fuel cap
stomach     fuel tank
bowel       combustion chamber
anus        exhaust pipe
skeleton    chassis
heart       engine
paw         wheel
fur         paint

VAE used methods #1 and #2.
Source data from WordNet mapped to CGs.

Matching Labels

Corresponding concepts have similar functions:

Fur and paint are outer coverings.

Heart and engine are internal parts with a regular beat.

Skeleton and chassis are structures for attaching parts.

Paw and wheel support the body, and there are four of each.
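
The idea of pairing concepts by a shared functional label can be sketched in a few lines of Python. This is only an illustration, not VAE's implementation: the functional descriptions are taken from the list above, while the dictionary layout and the names SUPERTYPE and match_labels are assumptions made for the example. VAE itself worked from WordNet data mapped to conceptual graphs.

```python
# Functional labels taken from the slide; the flat dictionary is an
# assumption standing in for a real type hierarchy such as WordNet's.
SUPERTYPE = {
    "fur": "outer covering",
    "paint": "outer covering",
    "heart": "internal part with a regular beat",
    "engine": "internal part with a regular beat",
    "skeleton": "structure for attaching parts",
    "chassis": "structure for attaching parts",
    "paw": "support for the body",
    "wheel": "support for the body",
}

CAT_PARTS = ["fur", "heart", "skeleton", "paw"]
CAR_PARTS = ["paint", "engine", "chassis", "wheel"]


def match_labels(source_parts, target_parts, supertype):
    """Pair each source part with the target parts that share its label."""
    by_label = {}
    for part in target_parts:
        by_label.setdefault(supertype[part], []).append(part)
    return {part: by_label.get(supertype[part], []) for part in source_parts}


pairs = match_labels(CAT_PARTS, CAR_PARTS, SUPERTYPE)
for cat_part, car_parts in pairs.items():
    print(cat_part, "->", car_parts)
```

Indexing the target parts by label first means each source part is matched with one dictionary lookup rather than a scan of every target part.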

Matching Subgraphs

A pair of isomorphic subgraphs:

Cat: head → eyes → cornea.

Car: hood → headlights → glass plate.

Approximate match (missing esophagus and muffler):

Cat: mouth → stomach → bowel → anus.

Car: fuel cap → fuel tank → combustion chamber → exhaust pipe.
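
A minimal sketch of label-independent matching for the two chains above, assuming each node has at most one successor so that the subgraphs are simple linear chains. The functions chains and match_chains are hypothetical helpers written for this example. They only compare chain lengths and node positions, which is far simpler than general subgraph matching and should not be read as VAE's algorithm.

```python
# Edges copied from the slide: each pair is (node, next node in the chain).
CAT_EDGES = [("head", "eyes"), ("eyes", "cornea"),
             ("mouth", "stomach"), ("stomach", "bowel"), ("bowel", "anus")]
CAR_EDGES = [("hood", "headlights"), ("headlights", "glass plate"),
             ("fuel cap", "fuel tank"),
             ("fuel tank", "combustion chamber"),
             ("combustion chamber", "exhaust pipe")]


def chains(edges):
    """Split directed edges into maximal chains (lists of node labels).

    Assumes every node has at most one outgoing edge.
    """
    successor = dict(edges)
    targets = set(successor.values())
    roots = [node for node in successor if node not in targets]
    result = []
    for root in roots:
        chain = [root]
        while chain[-1] in successor:
            chain.append(successor[chain[-1]])
        result.append(chain)
    return result


def match_chains(source_edges, target_edges):
    """Pair chains of equal length and map nodes position by position.

    Node labels are never compared; only the shape of each chain is used.
    """
    mapping = {}
    unused = chains(target_edges)
    for chain in chains(source_edges):
        for candidate in unused:
            if len(candidate) == len(chain):
                mapping.update(zip(chain, candidate))
                unused.remove(candidate)
                break
    return mapping


mapping = match_chains(CAT_EDGES, CAR_EDGES)
for cat_node, car_node in mapping.items():
    print(cat_node, "->", car_node)
```

Because no label is consulted, the pairing of mouth with fuel cap comes purely from the fact that both start a chain of four nodes.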

Relating Different Representations

Method #3 for relating data structures that represent equivalent information.

A structure described in different ways:

English description: "A red pyramid A, a green pyramid B, and a yellow pyramid C support a blue block D, which supports an orange pyramid E."

A relational database would use tables.

But there are many different options for choosing tables, rows, columns, and column labels.
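
To make the point about table design concrete, here is a sketch of two layouts for the structure described in the English sentence. Both layouts, and every table and column name in them, are hypothetical choices made for this example. They are not the tables used in the talk. The two helper functions reduce each layout to the same set of (object, attribute, value) facts, which shows that differently shaped tables can carry equivalent information.

```python
# Layout 1 (assumed): one table for objects, one for the support relation.
OBJECTS = [  # (id, shape, color)
    ("A", "pyramid", "red"),
    ("B", "pyramid", "green"),
    ("C", "pyramid", "yellow"),
    ("D", "block", "blue"),
    ("E", "pyramid", "orange"),
]
SUPPORTS = [("A", "D"), ("B", "D"), ("C", "D"), ("D", "E")]

# Layout 2 (assumed): a single table with different column labels.
# "carries" names the object resting on this one, or None.
ITEMS = [
    {"name": "A", "kind": "pyramid", "hue": "red", "carries": "D"},
    {"name": "B", "kind": "pyramid", "hue": "green", "carries": "D"},
    {"name": "C", "kind": "pyramid", "hue": "yellow", "carries": "D"},
    {"name": "D", "kind": "block", "hue": "blue", "carries": "E"},
    {"name": "E", "kind": "pyramid", "hue": "orange", "carries": None},
]


def facts_from_layout1(objects, supports):
    """Reduce the two-table layout to a set of (object, attribute, value)."""
    facts = set()
    for ident, shape, color in objects:
        facts.add((ident, "shape", shape))
        facts.add((ident, "color", color))
    for supporter, supported in supports:
        facts.add((supporter, "supports", supported))
    return facts


def facts_from_layout2(items):
    """Reduce the single-table layout to the same fact vocabulary."""
    facts = set()
    for row in items:
        facts.add((row["name"], "shape", row["kind"]))
        facts.add((row["name"], "color", row["hue"]))
        if row["carries"] is not None:
            facts.add((row["name"], "supports", row["carries"]))
    return facts


facts1 = facts_from_layout1(OBJECTS, SUPPORTS)
facts2 = facts_from_layout2(ITEMS)
print(facts1 == facts2)
```

The single-table layout works here only because each object supports at most one other object. A structure in which one block supports several objects would need the separate relation table of the first layout.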

Representation in a Relational DB

CG Derived from Relational DB

CG Derived from English

"A red pyramid A, a green pyramid B, and a yellow pyramid C support a blue block D, which supports an orange pyramid E."

The Two CGs Look Very Different

CG from RDB has 15 concept nodes and 8 relation nodes.

CG from English has 12 concept nodes and 11 relation nodes.

No label on any node in the first graph is identical to any label on any node in the second graph.

But there are some structural similarities.

VAE uses method #3 to find them.

Transformations Found by VAE

Top transformation applied to 5 subgraphs.
Bottom one applied to 4 subgraphs.
One application could be due to chance, but 4 or 5 contribute strong evidence for the mapping.

Evolutionary Pragmatism

Worm: sensory-motor cycle.
Fish: sensory-analogy-motor cycle.
Mammal: sensory-reasoning-motor cycle.
Human: sensory-induction-abduction-deduction-motor cycle.
Higher organisms include all the capabilities of the lower forms.

References

Paper on analogical reasoning by Sowa and Majumdar:
http://www.jfsowa.com/pubs/analog.htm

Paper on ontology, metadata, and semiotics:
http://www.jfsowa.com/ontology/ontometa.htm

Peirce's tutorial on existential graphs, with commentary by Sowa:
http://www.jfsowa.com/peirce/ms514.htm

Selected papers by Peirce on semeiotic and related topics; see his 1903 lectures on pragmatism in vol. 2 for material related to this talk:
Peirce, Charles Sanders (EP) The Essential Peirce, ed. by N. Houser, C. Kloesel, and members of the Peirce Edition Project, 2 vols., Indiana University Press, Bloomington, 1991-1998.

Cyc web sites:
http://www.cyc.com/
http://www.opencyc.org/

WordNet web site:
http://www.cogsci.princeton.edu/~wn/

FrameNet web site:
http://www.icsi.berkeley.edu/~framenet/

Copyright ©2003, John F. Sowa
