Copyright ©2003, John F. Sowa
Categorization in Cognitive Computer Science
John F. Sowa
VivoMind LLC
UQÀM Summer Institute on Cognitive Science
30 June 2003
Categorization
Classification and categorization are fundamental to intelligence in every species.
- Similarity: Recognition that two stimuli are signs of the same category.
- Identity: Recognition that two stimuli are signs of the same categories for all relevant purposes.
- Generalization/specialization: Recognition that some category includes another.
- Negation: Denial that some stimulus is a sign of some category.
Note: These four operations, when combined in all possible ways, are sufficient to define first-order logic.
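As a rough illustration, the four operations can be sketched over a toy model in which each category is simply a set of stimuli. The category names, the data, and the function names are all hypothetical, and the sketch makes no attempt to show how first-order logic is built from the operations.

```python
# Hypothetical toy model: each category is a set of the stimuli that signal it.
CATEGORIES = {
    "animal": {"cat", "dog", "trout"},
    "mammal": {"cat", "dog"},
    "pet": {"cat", "dog"},
    "fish": {"trout"},
}

def categories_of(stimulus):
    """All categories of which the stimulus is a sign."""
    return {name for name, members in CATEGORIES.items() if stimulus in members}

def similar(a, b):
    """Similarity: both stimuli are signs of at least one common category."""
    return bool(categories_of(a) & categories_of(b))

def identical(a, b, relevant):
    """Identity: both stimuli fall in the same categories among the relevant ones."""
    return (categories_of(a) & relevant) == (categories_of(b) & relevant)

def includes(general, special):
    """Generalization/specialization: one category includes another."""
    return CATEGORIES[special] <= CATEGORIES[general]

def is_not(stimulus, category):
    """Negation: the stimulus is not a sign of the category."""
    return stimulus not in CATEGORIES[category]
```

In this model a cat and a trout are similar (both are animals) but are identical only for purposes where "animal" is the sole relevant category.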
Categorization and Reasoning
- Deduction: Applying a general principle to a special case.
- Induction: Deriving a general principle from special cases.
- Abduction: Guessing that some general principle can relate a given pattern of cases.
- Analogy: Finding a common pattern in different cases.
Peirce's Logic of Pragmatism
Sensory-Reasoning-Motor Cycle
- Induction: From observations to generalizations to the "knowledge soup."
- Abduction: Extract hypotheses from the soup to form a tentative theory.
- Revision: More abductions to revise the theory.
- Deduction: Use the theory to make predictions.
- Action: Test predictions by changing the world.
- Repeat from the first step (induction).
Replacing Sherlock Holmes
A Big Categorization Project
Cyc project started in 1984 by Doug Lenat.
- Name comes from the stressed syllable of encyclopedia.
- Goal: implement the commonsense knowledge of an average human being.
- After $65 million and 650 person-years of work,
600,000 categories
defined by 2,000,000 axioms
organized in 6,000 microtheories.
- But it cannot compete with a 10-year-old child.
Cyc Review
Two-day DARPA-sponsored review of Cyc in June 2003 with about two dozen AI experts.
Consensus:
- Cyc is a unique and valuable resource:
A great deal has been learned from it.
Much more can be learned from it.
If it were canceled, something like it would have to be done again.
- Support for Cyc should be continued.
- Cyc should be freely available for research purposes.
- But there are many questions about the relationship of Cyc to other R & D efforts.
Lexical Resources
Developers of WordNet (George Miller) and FrameNet (Chuck Fillmore) were also present.
Consensus:
- Lexical resources are complementary to Cyc.
- Extremely valuable for natural language projects.
- Desirable to integrate contributions from various sources.
- Integration would require relatively modest funding.
- Word senses (synsets) can be linked to the categories of Cyc and other axiomatized ontologies.
Feigenbaum's Question
Ed Feigenbaum asked why Cyc has taken so long to become "intelligent".
- In 1961, I. J. Good made a prediction:
"It is more probable than not that, within the twentieth century, an ultraintelligent machine will be built and that it will be the last invention that man need make."
- Why hasn't Good's prediction come to pass?
- Is there some missing ingredient that the AI community hasn't discovered?
- What is it? Could it be added to Cyc?
Cyc's Piece of the Pie
- Cyc does not replace Sherlock Holmes.
- It requires people like him to write axioms.
- At a cost of $10,000 to encode one page from a textbook.
Ibn Taymiyya Contra Aristotle
- Fourteenth-century Muslim legal scholar.
- Admitted that deduction is necessary for pure mathematics.
- But for reasoning about the world, deduction is limited to the accuracy of the induction.
- Given the same data, analogy can replace induction + deduction.
Ibn Taymiyya's Argument
- A theory can be useful, if available.
- But analogy can be used when no theory exists.
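The contrast between the two routes can be sketched on a toy set of cases. Everything below is hypothetical (the cases, the feature names, and the three functions); it is only meant to show a theory-based route (induction followed by deduction) and a case-based route (analogy) reaching the same answer from the same data.

```python
# Hypothetical cases: (set of observed features, observed outcome).
CASES = [
    (frozenset({"wings", "feathers"}), "flies"),
    (frozenset({"fins", "scales"}), "swims"),
    (frozenset({"wings", "fur"}), "flies"),
]

def induce(cases):
    """Induction: keep a rule 'feature -> outcome' only if every case agrees."""
    seen = {}
    for features, outcome in cases:
        for feature in features:
            seen.setdefault(feature, set()).add(outcome)
    return {f: next(iter(outs)) for f, outs in seen.items() if len(outs) == 1}

def deduce(rules, features):
    """Deduction: apply the general rules to one special case."""
    return {rules[f] for f in features if f in rules}

def by_analogy(cases, features):
    """Analogy: reuse the outcome of the known case sharing the most features."""
    _, outcome = max(cases, key=lambda case: len(case[0] & features))
    return outcome

query = {"wings", "feathers", "beak"}
theory = induce(CASES)                     # the intermediate theory
via_theory = deduce(theory, query)         # induction + deduction
via_analogy = by_analogy(CASES, query)     # no theory needed
```

The analogy route skips the intermediate theory, which is why it remains usable when no theory has been formed, though the theory is reusable once it exists.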
VivoMind Analogy Engine
Three methods of analogy:
- Method #1, matching labels: Compare type labels on conceptual graphs.
- Method #2, matching subgraphs: Compare subgraphs independent of labels.
- Method #3, matching transformations: Transform subgraphs.
Methods #1 and #2 take O(N log N) time.
Method #3 takes polynomial time (analogies of analogies).
Analogy of Cat to Car
Cat         Car
head        hood
eye         headlight
cornea      glass plate
mouth       fuel cap
stomach     fuel tank
bowel       combustion chamber
anus        exhaust pipe
skeleton    chassis
heart       engine
paw         wheel
fur         paint
VAE used methods #1 and #2.
Source data from WordNet mapped to CGs.
Matching Labels
Corresponding concepts have similar functions:
- Fur and paint are outer coverings.
- Heart and engine are internal parts with a regular beat.
- Skeleton and chassis are structures for attaching parts.
- Paw and wheel support the body, and there are four of each.
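A minimal sketch of label matching, assuming each concept carries a small set of functional labels. The label sets are hypothetical, paraphrased from the bullets above rather than taken from WordNet, and the pairwise comparison is quadratic, so it does not reproduce the running time reported for VAE.

```python
# Hypothetical functional labels, paraphrased from the four examples above.
CAT = {
    "fur": {"outer covering"},
    "heart": {"internal part", "regular beat"},
    "skeleton": {"structure for attaching parts"},
    "paw": {"supports body", "four of them"},
}
CAR = {
    "paint": {"outer covering"},
    "engine": {"internal part", "regular beat"},
    "chassis": {"structure for attaching parts"},
    "wheel": {"supports body", "four of them"},
}

def match_labels(source, target):
    """Pair each source concept with the target concept sharing the most labels."""
    pairs = {}
    for name, labels in source.items():
        best = max(target, key=lambda t: len(labels & target[t]))
        if labels & target[best]:  # skip concepts that share no label at all
            pairs[name] = best
    return pairs

mapping = match_labels(CAT, CAR)
```

With these label sets the mapping pairs fur with paint, heart with engine, skeleton with chassis, and paw with wheel.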
Matching Subgraphs
A pair of isomorphic subgraphs:
- Cat: head → eyes → cornea.
- Car: hood → headlights → glass plate.
Approximate match (missing esophagus and muffler):
- Cat: mouth → stomach → bowel → anus.
- Car: fuel cap → fuel tank → combustion chamber → exhaust pipe.
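The idea of matching structure while ignoring labels can be sketched for the two chains above. This is a simplification under stated assumptions: it handles only simple directed chains (each node has at most one successor), pairs chains of equal length position by position, and does not model how VAE tolerates missing nodes such as the esophagus and muffler. The function names are hypothetical.

```python
CAT_EDGES = [("head", "eyes"), ("eyes", "cornea"),
             ("mouth", "stomach"), ("stomach", "bowel"), ("bowel", "anus")]
CAR_EDGES = [("hood", "headlights"), ("headlights", "glass plate"),
             ("fuel cap", "fuel tank"), ("fuel tank", "combustion chamber"),
             ("combustion chamber", "exhaust pipe")]

def chains(edges):
    """Maximal directed chains, assuming each node has at most one successor."""
    successor = dict(edges)
    targets = {b for _, b in edges}
    result = []
    for start in successor:
        if start not in targets:  # a node with no incoming edge starts a chain
            chain = [start]
            while chain[-1] in successor:
                chain.append(successor[chain[-1]])
            result.append(chain)
    return result

def match_chains(edges_a, edges_b):
    """Pair chains of equal length node by node, never looking at the labels."""
    by_length = {}
    for chain in chains(edges_b):
        by_length.setdefault(len(chain), []).append(chain)
    mapping = {}
    for chain in chains(edges_a):
        candidates = by_length.get(len(chain), [])
        if candidates:
            mapping.update(zip(chain, candidates.pop(0)))
    return mapping

mapping = match_chains(CAT_EDGES, CAR_EDGES)
```

Because only chain shape is compared, "cornea" maps to "glass plate" and "anus" to "exhaust pipe" even though no label is shared.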
Relating Different Representations
Method #3 for relating data structures that represent equivalent information.
- A structure described in different ways:
- English description: "A red pyramid A, a green pyramid B, and a yellow pyramid C support a blue block D, which supports an orange pyramid E."
- A relational database would use tables.
- But there are many different options for choosing tables, rows, columns, and column labels.
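To make the point about design options concrete, here is one possible relational encoding of the English description, followed by a second layout of the same facts. Both layouts, including every table and column name, are hypothetical illustrations and are not the tables shown in the talk.

```python
import sqlite3

# Facts taken from the English description; the schema is an assumption.
OBJECTS = [("A", "pyramid", "red"), ("B", "pyramid", "green"),
           ("C", "pyramid", "yellow"), ("D", "block", "blue"),
           ("E", "pyramid", "orange")]
SUPPORTS = [("A", "D"), ("B", "D"), ("C", "D"), ("D", "E")]

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE objects (id TEXT PRIMARY KEY, shape TEXT, color TEXT);
    CREATE TABLE supports (supporter TEXT, supportee TEXT);
""")
conn.executemany("INSERT INTO objects VALUES (?, ?, ?)", OBJECTS)
conn.executemany("INSERT INTO supports VALUES (?, ?)", SUPPORTS)

# Layout 1: two tables, queried with SQL.
supporters_of_d = [row[0] for row in conn.execute(
    "SELECT supporter FROM supports WHERE supportee = ? ORDER BY supporter",
    ("D",))]

# Layout 2: the same facts as one table of (subject, attribute, value) triples.
triples = []
for obj_id, shape, color in conn.execute(
        "SELECT id, shape, color FROM objects ORDER BY id"):
    triples.append((obj_id, "shape", shape))
    triples.append((obj_id, "color", color))
for supporter, supportee in conn.execute(
        "SELECT supporter, supportee FROM supports"):
    triples.append((supporter, "supports", supportee))
```

The two layouts carry the same information but share almost no table or column labels, which is the situation in which label matching fails and structural comparison is needed.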
Representation in a Relational DB
CG Derived from Relational DB
CG Derived from English
"A red pyramid A, a green pyramid B, and a yellow pyramid C support a blue block D, which supports an orange pyramid E."
The Two CGs Look Very Different
- CG from RDB has 15 concept nodes and 8 relation nodes.
- CG from English has 12 concept nodes and 11 relation nodes.
- No label on any node in the first graph is identical to any label on any node in the second graph.
- But there are some structural similarities.
- VAE uses method #3 to find them.
Transformations Found by VAE
Top transformation applied to 5 subgraphs.
Bottom one applied to 4 subgraphs.
One application could be due to chance, but 4 or 5 contribute strong evidence for the mapping.
Evolutionary Pragmatism
Worm: sensory-motor cycle.
Fish: sensory-analogy-motor cycle.
Mammal: sensory-reasoning-motor cycle.
Human: sensory-induction-abduction-deduction-motor cycle.
Higher organisms include all the capabilities of the lower forms.
References
Paper on analogical reasoning by Sowa and Majumdar:
http://www.jfsowa.com/pubs/analog.htm
Paper on ontology, metadata, and semiotics:
http://www.jfsowa.com/ontology/ontometa.htm
Peirce's tutorial on existential graphs, with commentary by Sowa:
http://www.jfsowa.com/peirce/ms514.htm
Selected papers by Peirce on semeiotic and related topics; see his 1903 lectures on pragmatism in vol. 2 for material related to this talk:
Peirce, Charles Sanders (EP) The Essential Peirce, ed. by N. Houser, C. Kloesel, and members of the Peirce Edition Project, 2 vols., Indiana University Press, Bloomington, 1991-1998.
Cyc web sites:
http://www.cyc.com/
WordNet web site:
http://www.cogsci.princeton.edu/~wn/
FrameNet web site:
http://www.icsi.berkeley.edu/~framenet/