INSTRUCTORS:
Prof. Anatoli Gorchetchnikov
Office: Rm. 213, 677 Beacon Street
Office hours: Monday 2-4pm, Friday 1-2pm, or by appointment (email works best)
Email: anatoli (at) cns (dot) bu (dot) edu
Dr. Heather Ames
Office: Rm. 308B, 677 Beacon Street
Office hours: Monday 2-4pm or by appointment (email works best)
Email: starfly (at) cns (dot) bu (dot) edu
TEACHING ASSISTANT:
None
COURSE DESCRIPTION:
CN550 develops neural network models of how internal representations of sensory events and cognitive hypotheses are learned and remembered, and of how such representations enable recognition and recall of these events. Various neural and statistical pattern recognition models, and their historical development and applications, are analyzed. Special attention is given to stable self-organization of pattern recognition and recall by Adaptive Resonance Theory (ART) models. Mathematical techniques and definitions to support fluent access to the neural network and pattern recognition literature are developed throughout the course. Experimental data and theoretical analyses from cognitive psychology, neuropsychology, and neurophysiology of normal and abnormal individuals are also discussed. Course work emphasizes skill development, including writing, mathematics, computational analysis, teamwork, and oral communication.
CLASS PROJECT:
CN550 includes a class project, as described in the accompanying materials. Part of each class is devoted to discussion of the class project and planning for the coming week. Each student will work in a group with one or two other students. Groups should plan to meet during the weekly discussion session and at other times, as needed.
COMPUTATIONAL WORKSHOPS:
Each class will conclude with a computational workshop.
HOMEWORK:
For the first part of the course, a multi-week phase plane assignment serves as homework; it is due at the midterm. For the second part of the course, the class project serves as homework.
GRADING CRITERIA:
Grades are determined by performance on:
1. essay on readings -- 5%
2. homework assignment -- 10%
3. computational workshops -- 20%
4. class project -- 20%
5. midterm exam -- 20%
6. final exam -- 25%
Participation in class discussions will play a role in determining the final letter grade in borderline cases.
Late homework policy: 10% penalty if turned in less than one week late, 20% penalty if 1-2 weeks late, and 30% penalty if more than 2 weeks late. No late homework will be accepted after the final exam.
REQUIRED TEXT:
Duda, Richard O., Hart, Peter E., & Stork, David (2001) Pattern Classification. Second Edition. New York: Wiley.
RECOMMENDED TEXTS:
Schacter, Daniel L. (1996) Searching for Memory: The Brain, the Mind, and the Past. New York: Basic Books. (paper).
Kandel, E., Schwartz, J.H., and Jessell, T.M. (2000). Principles of Neural Science, 4th Edition. New York: McGraw-Hill.
Levine, D.S. (2000). Introduction to Neural and Cognitive Modeling, 2nd Edition. Hillsdale, NJ: Erlbaum.
Strunk, William, Jr., & White, E.B. (1959-2000) The Elements of Style, Fourth Edition. Needham Heights, MA: Allyn & Bacon.
OTHER USEFUL RESOURCES:
APA style: http://www.apastyle.org/, http://www.bridgewater.edu/WritingCenter/manual/APAformat.htm
Hettich, S., & Bay, S.D. (1999) The UCI KDD Archive. Irvine, CA: University of California, Department of Information and Computer Science. http://www.ics.uci.edu/~mlearn/MLRepository.html
CN 710 materials
Fall 2008 http://cns.bu.edu/cn710/Fall2008/
Fall 2007 http://cns.bu.edu/cn710/Fall2007/
Fall 2006 http://cns.bu.edu/cn710/Fall2006/pmwiki.php?n=Main.HomePage
Spring 2006 http://cns.bu.edu/cn710/Spring2006/pmwiki.php?n=Main.HomePage
OTHER USEFUL TEXTS:
Too many to list here; please download the PDF.
SESSION 1 (January 24) Overview, history, philosophy, benchmark database studies
Course goals, topics, methods, assignments.
Historical review of principal neural network modules for learning, pattern recognition, and associative memory.
Class project: Comparative studies of supervised learning systems.
Benchmark database studies.
Readings:
Daugman, John G. (1990) Brain metaphor and brain theory. In Eric Schwartz (Ed.) Computational Neuroscience. Cambridge, Mass.: MIT Press. Chapter 2: pp. 9-18.
Borges, Jorge Luis (1942) Funes, the Memorious. In: Ficciones (translation), New York: Grove Press (1962), pp. 107-115.
Henig, Robin Marantz (2004) The quest to forget. The New York Times Magazine, April 4, 2004, pp. 32-37.
Treffert, Darold A., and Christensen, Daniel D. (2005) Inside the mind of a savant. Scientific American, Dec., pp. 108-113.
Carpenter, Gail A. (1989) Neural network models for pattern recognition and associative memory. Neural Networks, 2, 243-257.
McCulloch, Warren S., & Pitts, Walter (1943) A logical calculus of the ideas immanent in nervous activity. Bulletin of Mathematical Biophysics, 5, 115-133.
Bower, Gordon H. (2000) A brief history of memory research. In Endel Tulving & Fergus I.M. Craik, (Eds.) The Oxford Handbook of Memory. New York: Oxford University Press, Chapter 1, pp. 3-32.
Smith, Edward E., & Medin, Douglas L. (1981) Categories and Concepts. Cambridge, Mass.: Harvard University Press. Chapters 1-2, pp. 1-21.
Grossberg, Stephen (1982) Studies of Mind and Brain. Boston: Reidel / Kluwer Publ. - Preface, Introduction, and prefaces of chapters 1-13.
Supplemental materials:
http://www.npr.org/templates/story/story.php?storyId=5352811
Unique memory lets woman replay life like a movie. Morning Edition, April 20, 2006 -- Neurobiologist James McGaugh, one of the world's experts on human memory, says that a woman he calls AJ has a one-of-a-kind memory. In an interview with NPR, she talks about what life is like for someone who can remember things she's done and news events from almost every day of her life for the past 25 years. Her life is like a split-screen movie, with the past running almost as vividly as the present.
Clive Wearing: Living without memory. YouTube (BBC -- The Mind), parts 2a-2d. http://en.wikipedia.org/wiki/Clive_Wearing
Clive Alex Wearing (born 1938) is a British musicologist, conductor, and keyboardist suffering from an acute and long-lasting case of anterograde amnesia: he lacks the ability to form new memories. Laypeople and the media have dubbed the condition the "Memento" syndrome, after a film on the subject.
Lecture Notes:
SESSION 2 (January 31) Supervised learning methods: Memory-based algorithms (K-NN), model-independent supervised learning methods (validation & cross-validation, c-index, ROC curves, resampling, combining classifiers, component analysis), statistical pattern recognition
Memory-based algorithms: K-nearest neighbors (K-NN)
Approaching supervised learning problems fairly and systematically
Training, testing, validation, and cross-validation
ROC curves and the c-index
Resampling: bootstrapping, boosting, bagging
Combining systems: mixing models and voting
Data preparation: component analysis
Brief introduction to statistical pattern recognition and Bayesian estimation
Readings:
Duda, Richard O., Hart, Peter E., & Stork, David (2001) Pattern Classification. Second Edition. New York: Wiley.
1. Section 2.8.3: Signal detection theory and operating characteristics, pp. 48-51.
2. Sections 3.1-3.4, pp. 84-97.
3. Section 3.8: Component analysis and discriminants, pp. 114-124.
4. Sections 4.1-4.6: Nonparametric techniques, pp. 161-192.
5. Section 9.4: Resampling for estimating statistics, pp. 471-475.
6. Section 9.5: Resampling for classifier design, pp. 475-482.
7. Section 9.6.2: Cross-validation, pp. 483-485.
8. Section 9.7: Combining classifiers, pp. 495-499.
Carpenter, Gail A., Grossberg, Stephen, Markuzon, Natalya, Reynolds, John H., & Rosen, David B. (1992) Fuzzy ARTMAP: A neural network architecture for incremental supervised learning of analog multidimensional maps. IEEE Transactions on Neural Networks, 3, 698-713.
http://en.wikipedia.org/wiki/Resampling_%28statistics%29
http://en.wikipedia.org/wiki/Bootstrap_aggregating
http://en.wikipedia.org/wiki/Boosting
http://en.wikipedia.org/wiki/Principal_Component_Analysis
http://en.wikipedia.org/wiki/Fisher_linear_discriminant
http://en.wikipedia.org/wiki/Maximum_likelihood
http://en.wikipedia.org/wiki/Bayes%27_theorem
Lecture Notes:
SESSION 3 (February 7) Unsupervised learning: Clustering (leader, K-means), competitive learning, ART
Competitive learning
Adaptive resonance theory - 1970s
ART 1: Binary pattern learning
ART 2-A: A fast, algorithmic version of ART 2
Freud's neural networks
Readings:
Duda, Richard O., Hart, Peter E., & Stork, David (2001) Pattern Classification. Second Edition. New York: Wiley. Section 10.4.3: k-means clustering, pp. 526-528.
Levine, Daniel S. (2000) Introduction to Neural and Cognitive Modeling. Hillsdale, NJ: Lawrence Erlbaum Associates, 2nd Edition.
1. Chapter 4: Competition, lateral inhibition, and short-term memory, pp. 95-154
2. Chapter 6: Coding and categorization, pp. 198-279
Malsburg, Christoph von der (1973) Self-organization of orientation sensitive cells in the striate cortex. Kybernetik, 14, 85-100.
Grossberg, Stephen (1976) Adaptive pattern classification and universal recoding, I: Parallel development and coding of neural feature detectors. Biological Cybernetics, 23, 121-134.
Grossberg, Stephen (1976) Adaptive pattern classification and universal recoding, II: Feedback, expectation, olfaction, and illusions. Biological Cybernetics, 23, 187-202.
Carpenter, Gail A., & Grossberg, Stephen (1987) A massively parallel architecture for a self-organizing neural pattern recognition machine. Computer Vision, Graphics, and Image Processing, 37, 54-115.
Moore, Barbara (1989) ART 1 and pattern clustering. In David S. Touretzky, Geoffrey Hinton, & Terrence Sejnowski (Eds.) Proceedings of the 1988 Connectionist Models Summer School. San Mateo, Calif.: Morgan Kaufmann Publishers. pp. 174-185.
Carpenter, Gail A., Grossberg, Stephen, & Rosen, David B. (1991) ART 2-A: An Adaptive Resonance algorithm for rapid category learning and recognition. Neural Networks, 4, 493-504.
Freud, Sigmund (1886-1899) Project for a Scientific Psychology. pp. 322-325. (1900) The Interpretation of Dreams. Introduction by James Strachey (Editor and translator). New York: Avon Books (1965).
http://en.wikipedia.org/wiki/K-means
Lecture Notes:
SESSION 4 (February 14) Dimensional analysis, competitive networks, phase plane analysis
Dynamics of on-center off-surround shunting competitive networks
Phase plane analysis of competitive networks
Readings:
Lin, C.C., & Segel, L.A. (1974) Mathematics Applied to Deterministic Problems in the Natural Sciences. New York: Macmillan.
1. Chapter 6: Simplification, dimensional analysis, and scaling, pp. 185-224
Edelstein-Keshet, Leah (1988) Mathematical Models in Biology. SIAM Classics in Applied Mathematics, vol. 46.
1. Section 4.3: Formulating a model
2. Section 4.4: Saturating nutrient consumption rate
3. Section 4.5: Dimensional analysis of the equations
4. Sections 5.2-5.9: Phase-plane methods and qualitative solutions, pp. 171-193
Boston University Ordinary Differential Equations Project: http://math.bu.edu/odes/.
1. Section 3.3: Phase planes for linear systems with real eigenvalues, pp. 266-282.
2. Section 5.2: Qualitative analysis, pp. 457-470.
3. Section 5.3: Hamiltonian systems, pp. 470-488.
4. Section 5.4: Dissipative systems, pp. 488-510.
http://en.wikipedia.org/wiki/Phase_plane
http://en.wikipedia.org/wiki/Dimensional_analysis
http://en.wikipedia.org/wiki/Hamiltonian_mechanics
http://en.wikipedia.org/wiki/Dissipative
Lecture Notes:
SESSION 5 (February 22 Attention! Tuesday Class!) ARTMAP
Fuzzy ART: Generalized ART 1, for analog inputs, using the city-block metric (L1 norm)
Supervised learning by ART systems
Binary ARTMAP
Analog fuzzy ARTMAP
Readings:
Carpenter, Gail A., Grossberg, Stephen, & Rosen, David B. (1991) Fuzzy ART: Fast stable learning and categorization of analog patterns by an Adaptive Resonance system. Neural Networks, 4, 759-771.
Carpenter, Gail A., Grossberg, Stephen, & Reynolds, John H. (1991) ARTMAP: Supervised real-time learning and classification of nonstationary data by a self-organizing neural network. Neural Networks, 4, 565-588.
Carpenter, Gail A., Grossberg, Stephen, Markuzon, Natalya, Reynolds, John H., & Rosen, David B. (1992) Fuzzy ARTMAP: A neural network architecture for incremental supervised learning of analog multidimensional maps. IEEE Transactions on Neural Networks, 3, 698-713.
Carpenter, Gail A. (2003). Default ARTMAP. Proceedings of the International Joint Conference on Neural Networks (IJCNN'03), Portland, Oregon, 1396-1401.
Frey, Peter W., & Slate, David J. (1991) Letter recognition using Holland-style adaptive classifiers. Machine Learning, 6, 161-182.
Zadeh, Lotfi A. (1965) Fuzzy sets. Information and Control, 8, 338-353.
http://en.wikipedia.org/wiki/Fuzzy_sets
http://en.wikipedia.org/wiki/Fuzzy_logic
Lecture Notes:
SESSION 6 (February 28) Associative memory networks: Back propagation, multi-layer perceptrons, radial basis functions, cascade-correlation, higher-order networks
Back propagation
Multi-layer perceptrons
(Local) minimization of cost functions
Radial basis functions (RBFs)
Cascade-correlation architecture
Higher order networks
Readings:
Duda, Richard O., Hart, Peter E., & Stork, David (2001) Pattern Classification. Second Edition. New York: Wiley.
1. Sections 6.1-6.8: Multilayer neural networks, pp. 282-318.
2. Section 6.10.1: Radial basis function networks (RBFs), pp. 324-325.
3. Section 6.10.6: Cascade-correlation, pp. 329-330.
Fahlman, Scott E., & Lebiere, Christian (1990) The cascade-correlation learning architecture. In David S. Touretzky (Ed.) Neural Information Processing Systems 2, Proceedings of the NIPS Conference, Denver, 1989, San Mateo, Calif.: Morgan Kaufmann Publishers. pp. 524-532.
Giles, C. Lee, & Maxwell, Thomas (1987) Learning, invariance, and generalization in high-order neural networks. Applied Optics, 26, 4972-4978.
Lowe, David (2003) Radial basis function networks. In Michael A. Arbib (Ed.) The Handbook of Brain Theory and Neural Networks, Second Edition. Cambridge, Mass.: MIT Press. pp. 937-940.
Moody, John, & Darken, Christian J. (1989) Fast learning in networks of locally-tuned processing units. Neural Computation, 1, 281-294.
Rosenblatt, F. (1958) The perceptron: A probabilistic model for information storage and organization in the brain. Psychological Review, 65, 386-408.
Rumelhart, David E., Hinton, Geoffrey E., & Williams, Ronald J. (1986) Learning internal representations by error propagation. In David E. Rumelhart & James L. McClelland (Eds.), Parallel Distributed Processing: Explorations in the Microstructure of Cognition, Vol. I. Cambridge, Mass.: MIT Press. pp. 318-362.
http://en.wikipedia.org/wiki/Cascade_correlation
Lecture Notes:
SESSION 7 (March 7) Support vector machines
Support vector machines (SVMs)
Constrained optimization
Lagrange multipliers
Readings:
Duda, Richard O., Hart, Peter E., & Stork, David (2001) Pattern Classification. Second Edition. New York: Wiley.
1. Section A.3: Lagrange optimization, p. 610.
2. Section 5.11: Support vector machines, pp. 259-265.
Vapnik, Vladimir N. (1998) Statistical Learning Theory. New York: John Wiley.
1. Section 9.5: Three theorems of optimization theory, pp. 390-394.
2. Chapter 10: The support vector method for estimating indicator functions, pp. 401-441.
Strang, Gilbert (1988) Linear Algebra and its Applications, Third Edition. New York: Harcourt Brace Jovanovich College Publishers. Section 8.3: The theory of duality, pp. 412-423.
Bartlett, Peter L., & Maass, Wolfgang (2003) Vapnik-Chervonenkis dimension of neural nets. In Michael A. Arbib (Ed.) The Handbook of Brain Theory and Neural Networks, Second Edition. Cambridge, Mass.: MIT Press. pp. 1188-1192.
Bishop, Christopher M. (2006) Pattern Recognition and Machine Learning. Springer. Appendix E: Lagrange multipliers, pp. 707-710.
http://en.wikipedia.org/wiki/Optimization_%28mathematics%29
http://en.wikipedia.org/wiki/Dual_space
http://en.wikipedia.org/wiki/Constrained_Optimization_and_lagrange_Multipliers
http://en.wikipedia.org/wiki/Support_vector_machines
Lecture Notes:
SESSION 8 (March 21) Mid-term Exam
Phase plane assignment is due before the exam begins.
SESSION 9 (March 28) Physiology, psychology, and memory
Neural substrates of memory
Cortical organization
Neuropsychology of memory and amnesia
Neurobiology of chemical synapses, neuromodulators, and short-term synaptic plasticity
Synaptic modification
Retrograde messengers
Readings:
Bear, Mark F., Connors, Barry W., & Paradiso, Michael A. (1996) Neuroscience: Exploring the Brain. Baltimore: Williams & Wilkins. Chapter 19: Memory systems, pp. 514-545.
Levine, Daniel S. (2000) Introduction to Neural and Cognitive Modeling. Hillsdale, NJ: Lawrence Erlbaum Associates, 2nd Edition. Appendix 1: Basic Facts of Neurobiology, pp. 375-395.
Corkin, Suzanne (2002) What's new with the amnesic patient H.M.? Nature Reviews Neuroscience, 3, 153-160.
Freedman, David J., Riesenhuber, Maximilian, Poggio, Tomaso, & Miller, Earl K. (2003) A comparison of primate prefrontal and inferior temporal cortices during visual categorization. Journal of Neuroscience, 23, 5235-5246.
Kandel, Eric R., Schwartz, James H., & Jessell, Thomas M. (Eds.) (2000) Principles of Neural Science. 4th Edition. New York: McGraw-Hill. Chapter 63: Eric R. Kandel. Cellular mechanisms of learning and the biological basis of individuality. pp. 1247-1279.
Kandel, Eric R., Schwartz, James H., & Jessell, Thomas M. (Eds.) (2000) Principles of Neural Science. 4th Edition. New York: McGraw-Hill. Chapter 10: Eric R. Kandel & Steven A. Siegelbaum. Overview of synaptic transmission. pp. 175-186.
Malenka, Robert C., & Nicoll, Roger A. (1999) Long-term potentiation - a decade of progress? Science, 285, 1870-1874.
Zucker, Robert S. (1989) Short-term synaptic plasticity. Annual Review of Neuroscience, 12, pp. 13-31.
Atkinson, Richard C., & Shiffrin, Richard M. (1971) The control of short-term memory. Scientific American, 82-90.
http://en.wikipedia.org/wiki/Synapse
http://en.wikipedia.org/wiki/Retrograde_signaling_in_LTP
http://en.wikipedia.org/wiki/Nitric_oxide
http://videocast.nih.gov/podcast.asp?13746
http://en.wikipedia.org/wiki/Long-term_potentiation
http://en.wikipedia.org/wiki/HM_%28patient%29
http://en.wikipedia.org/wiki/Neuron
http://en.wikipedia.org/wiki/Cerebral_cortex
http://en.wikipedia.org/wiki/Atkinson-Shiffrin_memory_model
Lecture Notes:
SESSION 10 (April 4) Decision Trees
Decision trees
Readings:
Duda, Richard O., Hart, Peter E., & Stork, David (2001) Pattern Classification. Second Edition. New York: Wiley.
1. Chapter 8: Nonmetric Methods, pp. 394-436.
Lecture Notes:
SESSION 11 (April 11) Liapunov Functions, Cohen-Grossberg Theorem, Hierarchical Temporal Memories
Liapunov functions and the LaSalle invariance principle
The Cohen-Grossberg theorem
Hierarchical temporal memories (guest lecture by John Agapiou)
Readings:
Cohen, M. and Grossberg, S. (1983). Absolute stability of global pattern formation and parallel memory storage by competitive neural networks. IEEE Transactions on Systems, Man, and Cybernetics, 13, pp. 815-826.
Grossberg, Stephen (1988) Nonlinear neural networks: Principles, mechanisms, and architectures. Neural Networks, 1, 17-61. Section 9: Content-addressable memory storage: a general STM model and Liapunov method, pp. 24-30.
George, D. and Hawkins, J. (2009). Towards a mathematical theory of cortical micro-circuits. PLoS Computational Biology, 5(10), e1000532.
Brauer, Fred, & Nohel, John (1969). The qualitative theory of differential equations. W.A. Benjamin. Sections 5.1 and 5.2.
Bishop, Christopher M. (2006) Pattern Recognition and Machine Learning. Springer. Section 8.4.
Lecture Notes:
Guest Lecture PDF (not here yet)
SESSION 12 (April 21 Attention! Thursday Class!) Boltzmann Machines; Genetic Algorithms
Readings:
Duda, Richard O., Hart, Peter E., & Stork, David (2001) Pattern Classification. Second Edition. New York: Wiley. Chapter 7: Stochastic Methods, pp. 350-393.
Lecture Notes:
SESSION 13 (April 25) Invariance, Integral Transforms, Moments
Invariant pattern recognition
Fourier analysis
Log-polar-Fourier filter
Algebraic invariance
Requirements for an invariant pattern recognition system
Readings:
Cavanagh, Patrick (1984) Image transforms in the visual system. In Peter C. Dodwell & Terry Caelli (Eds.) Figural Synthesis. Hillsdale, NJ: Lawrence Erlbaum Associates. pp. 185-218.
Wood, Jeffrey (1996) Invariant pattern recognition: A review. Pattern Recognition, 29(1), 1-17.
http://en.wikipedia.org/wiki/Fourier_transform
http://en.wikipedia.org/wiki/Complex_logarithm
http://en.wikipedia.org/wiki/Image_moment
http://en.wikipedia.org/wiki/Zernike_polynomials
Lecture Notes:
SESSION 14 (May 2) Class Project Student Presentations.
FINAL EXAM Date: May 10th at 5PM