Generalized Hebbian algorithm - Revision history
Revision history for this page on the wiki (Atom feed retrieved 2025-06-04T16:53:39Z, MediaWiki 1.45.0-wmf.3)
https://en.wikipedia.org/w/index.php?action=history&feed=atom&title=Generalized_Hebbian_algorithm

Revision as of 07:20, 28 May 2025 (2025-05-28T07:20:08Z), by OAbot
https://en.wikipedia.org/w/index.php?title=Generalized_Hebbian_algorithm&diff=1292681887&oldid=prev
Summary: Open access bot: url-access updated in citation with #oabot.
Change (Line 49, Applications section): added |url-access=subscription to the Olshausen and Field (June 1996) Nature citation ("Emergence of simple-cell receptive field properties by learning a sparse code for natural images", Nature 381(6583): 607–609, doi:10.1038/381607a0, pmid 8637596) attached to the paragraph beginning "As an example, (Olshausen and Field, 1996) performed the generalized Hebbian algorithm on 8-by-8 patches of photos of natural scenes ...". The paragraph text itself is unchanged.
Revision as of 20:05, 12 December 2024 (2024-12-12T20:05:35Z), by Citation bot
https://en.wikipedia.org/w/index.php?title=Generalized_Hebbian_algorithm&diff=1262710931&oldid=prev
Summary: Add: pmid, authors 1-1. Removed parameters. Some additions/deletions were parameter name changes. | Use this bot. Report bugs. | Suggested by Dominic3203 | Category:Artificial neural networks | #UCB_Category 143/155
Change (Line 49): in the same Olshausen and Field (1996) citation, renamed |last=/|first= to |last1=/|first1= and added |pmid=8637596. The surrounding paragraph is unchanged.
Revision as of 01:31, 20 November 2024 (2024-11-20T01:31:57Z), by Ira Leviton
https://en.wikipedia.org/w/index.php?title=Generalized_Hebbian_algorithm&diff=1258493589&oldid=prev
Summary: Fixed another date.
Change (Line 49): in the Olshausen and Field (1996) citation, changed |date=1996-06 to |date=June 1996. The surrounding paragraph is unchanged.
Revision as of 01:25, 20 November 2024 (2024-11-20T01:25:26Z), by Ira Leviton
https://en.wikipedia.org/w/index.php?title=Generalized_Hebbian_algorithm&diff=1258492776&oldid=prev
Summary: Fixed a reference and jargon. Please see Category:CS1 errors: dates.
This revision expands the abbreviations "GHA" and "PCA" throughout the article and moves one reference.

Line 1 (lead): removed the parenthetical "('''GHA''')" after the bolded article title. The lead otherwise still reads: "The '''generalized Hebbian algorithm''', also known in the literature as '''Sanger's rule''', is a linear [[feedforward neural network]] for [[unsupervised learning]] with applications primarily in [[principal components analysis]]. First defined in 1989 (Sanger, "Optimal unsupervised learning in a single-layer linear feedforward neural network", Neural Networks 2(6): 459–473, doi:10.1016/0893-6080(89)90044-0), it is similar to [[Oja's rule]] in its formulation and stability, except it can be applied to networks with multiple outputs. The name originates because of the similarity between the algorithm and a hypothesis made by [[Donald Hebb]] (The Organization of Behavior, Wiley & Sons, 1949) about the way in which synaptic strengths in the brain are modified in response to experience, i.e., that changes are proportional to the correlation between the firing of pre- and post-synaptic [[neurons]] (Hertz, Krough and Palmer, Introduction to the Theory of Neural Computation, Addison-Wesley, 1991)."
Theory section: two occurrences of "The GHA ..." were expanded. The sentence introducing the algorithm now reads "The generalized Hebbian algorithm is an iterative algorithm to find the highest principal component vectors, in an algorithmic form that resembles [[Unsupervised learning|unsupervised]] Hebbian learning in neural networks.", and "The GHA learning rule is of the form" became "The generalized Hebbian algorithm learning rule is of the form". The Gorrell citation (Gorrell, Genevieve, "Generalized Hebbian Algorithm for Incremental Singular Value Decomposition in Natural Language Processing", EACL 2006, citeseerx 10.1.1.102.2084) was moved from that sentence to the end of the following sentence, "where <math>\eta</math> is the ''[[learning rate]]'' parameter." The rule itself is unchanged:

:<math>\,\Delta w_{ij} ~ = ~ \eta y_i \left(x_j - \sum_{k=1}^{i} w_{kj} y_k \right)</math>

where, in the surrounding (unchanged) context, <math>w_{ij}</math> is the synaptic weight between the <math>j</math>-th input and <math>i</math>-th output neuron of a one-layered network with <math>n</math> input neurons and <math>m</math> output neurons <math>y_1, \dots, y_m</math>, and the rows <math>w_1, \dots, w_m</math> are the linear code vectors that should converge to the highest principal component vectors.
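For orientation only (this sketch is not part of the page or its history), a minimal NumPy rendering of the update rule quoted above; the function name, learning rate, and toy data are illustrative assumptions:

import numpy as np

def gha_update(W, x, eta=0.01):
    """One generalized Hebbian algorithm step.

    W   : (m, n) weight matrix; row i is the code vector w_i
    x   : (n,) input sample
    eta : learning rate
    Implements  dW[i, j] = eta * y_i * (x_j - sum_{k <= i} W[k, j] * y_k).
    """
    y = W @ x                                           # outputs y_i = sum_j W[i, j] * x_j
    # lower-triangular feedback: row i reconstructs x from outputs 1..i
    recon = np.tril(np.ones((len(y), len(y)))) @ (y[:, None] * W)
    W += eta * y[:, None] * (x[None, :] - recon)
    return W

# Toy usage: learn the top 4 principal directions of correlated 8-D data.
rng = np.random.default_rng(0)
X = rng.normal(size=(10000, 8)) @ rng.normal(size=(8, 8))
W = rng.normal(scale=0.1, size=(4, 8))
for x in X:
    W = gha_update(W, x, eta=0.001)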
Stability section: the heading "===Stability and PCA===" was renamed to "===Stability and Principal Components Analysis===", and "One can think of the GHA as iterating Oja's rule." became "One can think of the generalized Hebbian algorithm as iterating Oja's rule." The surrounding (unchanged) text notes that [[Oja's rule]] is the special case where <math>m = 1</math> (Oja, "Simplified neuron model as a principal component analyzer", Journal of Mathematical Biology 15(3): 267–273, 1982, doi:10.1007/BF00275687), and that with Oja's rule <math>w_1</math> converges to the direction of the largest principal component vector, with its length set so that the network acts as an [[autoencoder]] with latent code <math>y_1 = \sum_i w_{1i} x_i</math> minimizing <math>E[\| x - y_1 w_1 \|^2]</math>.

Applications section (Line 42): "The GHA is used in applications ..." became "The generalized Hebbian algorithm is used in applications where a [[self-organizing map]] is necessary, or where a feature or [[principal components analysis]] can be used. Examples of such cases include [[artificial intelligence]] and speech and image processing." The unchanged text that follows notes that learning is a single-layer process, avoiding the multi-layer dependence of [[backpropagation]], with a simple and predictable trade-off between learning speed and accuracy of convergence set by the learning rate parameter η (Haykin, Neural Networks: A Comprehensive Foundation, 2nd ed., Prentice Hall, 1998).
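A small, self-contained sketch (again, not part of the revision history) of the "iterating Oja's rule" reading: Oja's rule learns the first principal direction, and running it again on inputs with that component subtracted (the deflation the article describes later as x' = x - y_1 w_1) yields the second. The step size and synthetic data are illustrative assumptions:

import numpy as np

def oja_step(w, x, eta=0.002):
    # Oja's rule: the m = 1 special case of the generalized Hebbian algorithm
    y = w @ x
    return w + eta * y * (x - y * w)

rng = np.random.default_rng(1)
X = rng.normal(size=(20000, 5)) @ rng.normal(size=(5, 5))   # correlated samples

w1 = rng.normal(scale=0.1, size=5)
for x in X:
    w1 = oja_step(w1, x)                  # converges toward the top principal direction

w2 = rng.normal(scale=0.1, size=5)
for x in X:
    x_deflated = x - (w1 @ x) * w1        # remove the first component: x' = x - y_1 w_1
    w2 = oja_step(w2, x_deflated)         # Oja's rule on x' finds the second component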
Figure captions: "Features learned by GHA running on 8-by-8 patches of [[Caltech 101]]." became "Features learned by generalized Hebbian algorithm running on 8-by-8 patches of [[Caltech 101]]."; the companion caption "Features found by [[Principal Component Analysis]] on the same Caltech 101 dataset." is unchanged.

Line 49 (Applications section): "performed GHA" became "performed the generalized Hebbian algorithm" and "found by PCA" became "found by principal components analysis", so the paragraph now reads: "As an example, (Olshausen and Field, 1996) performed the generalized Hebbian algorithm on 8-by-8 patches of photos of natural scenes, and found that it results in Fourier-like features. The features are the same as the principal components found by principal components analysis, as expected, and that, the features are determined by the <math>64\times 64</math> variance matrix of the samples of 8-by-8 patches. In other words, it is determined by the second-order statistics of the pixels in images. They criticized this as insufficient to capture higher-order statistics which are necessary to explain the Gabor-like features of simple cells in the [[primary visual cortex]]."
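As a point of reference (not part of the page history), the claim that these features are fixed by the 64×64 variance matrix of the patches can be checked directly, since the principal components the generalized Hebbian algorithm converges to are the top eigenvectors of that matrix. The synthetic image below is a stand-in assumption for real natural-scene data:

import numpy as np

rng = np.random.default_rng(0)
image = rng.normal(size=(256, 256))             # stand-in for a natural image

# Collect non-overlapping 8-by-8 patches as 64-dimensional vectors.
patches = np.array([image[i:i + 8, j:j + 8].ravel()
                    for i in range(0, 256, 8) for j in range(0, 256, 8)])
patches -= patches.mean(axis=0)

cov = patches.T @ patches / len(patches)        # the 64x64 variance matrix
eigvals, eigvecs = np.linalg.eigh(cov)          # eigenvalues in ascending order
top_features = eigvecs[:, ::-1][:, :16].T.reshape(16, 8, 8)   # top 16 components as 8x8 filters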
Revision as of 06:35, 19 November 2024 (2024-11-19T06:35:41Z), by Cosmia Nebula
https://en.wikipedia.org/w/index.php?title=Generalized_Hebbian_algorithm&diff=1258355677&oldid=prev
Summary: /* Stability and PCA */
Change (diff at Line 37): dropped a stray word, "unaffected by the other second neuron" → "unaffected by the second neuron". The paragraph now reads: "When <math>m = 2</math>, the first neuron in the hidden layer of the autoencoder still learns as described, since it is unaffected by the second neuron. So, after the first neuron and its vector <math>w_1</math> has converged, the second neuron is effectively running another Oja's rule on the modified input vectors, defined by <math>x' = x - y_1w_1</math>, which we know is the input vector with the first principal component removed. Therefore, the second neuron learns to code for the second principal component." The following line, "By induction, this results in finding the top-<math>m</math> principal components for arbitrary <math>m</math>.", is unchanged.

Revision as of 06:35, 19 November 2024 (2024-11-19T06:35:10Z), by Cosmia Nebula
https://en.wikipedia.org/w/index.php?title=Generalized_Hebbian_algorithm&diff=1258355626&oldid=prev
Summary: /* Stability and PCA */
Change (diff at Line 35): the diff shown in the feed is cut off here; only the previous-revision side of the line "[[Oja's rule]] is the special case where <math>m = 1</math>. ... One can think of the GHA as iterating Oja's rule." is visible.
</div></td> <td class="diff-marker"></td> <td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><div>[[Oja's rule]] is the special case where &lt;math&gt;m = 1&lt;/math&gt;.&lt;ref name="Oja82"&gt;{{cite journal |last=Oja |first=Erkki |author-link=Erkki Oja |date=November 1982 |title=Simplified neuron model as a principal component analyzer |journal=Journal of Mathematical Biology |volume=15 |issue=3 |pages=267–273 |doi=10.1007/BF00275687 |pmid=7153672 |s2cid=16577977 |id=BF00275687}}&lt;/ref&gt; One can think of the GHA as iterating Oja's rule. </div></td> </tr> <tr> <td class="diff-marker"></td> <td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><br /></td> <td class="diff-marker"></td> <td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><br /></td> </tr> <tr> <td class="diff-marker" data-marker="−"></td> <td style="color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;"><div>With Oja's rule, &lt;math&gt;w_1&lt;/math&gt; is learned, and it has the same direction as the largest principal component vector is learned, with length determined by &lt;math&gt;E[x_j] = E[w_{1j} y_1]&lt;/math&gt; for all &lt;math&gt;j &lt;/math&gt;, where the expectation is taken over all input-output pairs. In other words, the length of the vector &lt;math&gt;w_1&lt;/math&gt; is such that we have an autoencoder, with the latent code &lt;math&gt;y_1 = \sum_i w_{1i} x_i &lt;/math&gt;, such that &lt;math&gt;E[\| x - y_1 w_1 \|^2] &lt;/math&gt; is minimized.</div></td> <td class="diff-marker" data-marker="+"></td> <td style="color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;"><div>With Oja's rule, &lt;math&gt;w_1&lt;/math&gt; is learned, and it has the same direction as the largest principal component vector is learned, with length determined by &lt;math&gt;E[x_j] = E[w_{1j} y_1]&lt;/math&gt; for all &lt;math&gt;j &lt;/math&gt;, where the expectation is taken over all input-output pairs. 
In other words, the length of the vector &lt;math&gt;w_1&lt;/math&gt; is such that we have an <ins style="font-weight: bold; text-decoration: none;">[[</ins>autoencoder<ins style="font-weight: bold; text-decoration: none;">]]</ins>, with the latent code &lt;math&gt;y_1 = \sum_i w_{1i} x_i &lt;/math&gt;, such that &lt;math&gt;E[\| x - y_1 w_1 \|^2] &lt;/math&gt; is minimized.</div></td> </tr> <tr> <td class="diff-marker"></td> <td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><br /></td> <td class="diff-marker"></td> <td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><br /></td> </tr> <tr> <td class="diff-marker"></td> <td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><div>When &lt;math&gt;m = 2 &lt;/math&gt;, the first neuron in the hidden layer of the autoencoder still learns as described, since it is unaffected by the other second neuron. So, after the first neuron and its vector &lt;math&gt;w_1&lt;/math&gt; has converged, the second neuron is effectively running another Oja's rule on the modified input vectors, defined by &lt;math&gt;x' = x - y_1w_1 &lt;/math&gt;, which we know is the input vector with the first principal component removed. Therefore, the second neuron learns to code for the second principal component. </div></td> <td class="diff-marker"></td> <td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><div>When &lt;math&gt;m = 2 &lt;/math&gt;, the first neuron in the hidden layer of the autoencoder still learns as described, since it is unaffected by the other second neuron. So, after the first neuron and its vector &lt;math&gt;w_1&lt;/math&gt; has converged, the second neuron is effectively running another Oja's rule on the modified input vectors, defined by &lt;math&gt;x' = x - y_1w_1 &lt;/math&gt;, which we know is the input vector with the first principal component removed. Therefore, the second neuron learns to code for the second principal component. 
</div></td> </tr> </table> Cosmia Nebula https://en.wikipedia.org/w/index.php?title=Generalized_Hebbian_algorithm&diff=1258355583&oldid=prev Cosmia Nebula: /* Theory */ induction 2024-11-19T06:34:45Z <p><span class="autocomment">Theory: </span> induction</p> <table style="background-color: #fff; color: #202122;" data-mw="interface"> <col class="diff-marker" /> <col class="diff-content" /> <col class="diff-marker" /> <col class="diff-content" /> <tr class="diff-title" lang="en"> <td colspan="2" style="background-color: #fff; color: #202122; text-align: center;">← Previous revision</td> <td colspan="2" style="background-color: #fff; color: #202122; text-align: center;">Revision as of 06:34, 19 November 2024</td> </tr><tr> <td colspan="2" class="diff-lineno">Line 11:</td> <td colspan="2" class="diff-lineno">Line 11:</td> </tr> <tr> <td class="diff-marker"></td> <td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><div>The GHA learning rule is of the form&lt;ref&gt;{{Citation |last=Gorrell |first=Genevieve |title=Generalized Hebbian Algorithm for Incremental Singular Value Decomposition in Natural Language Processing. |journal=EACL |year=2006 |citeseerx=10.1.1.102.2084}}&lt;/ref&gt;</div></td> <td class="diff-marker"></td> <td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><div>The GHA learning rule is of the form&lt;ref&gt;{{Citation |last=Gorrell |first=Genevieve |title=Generalized Hebbian Algorithm for Incremental Singular Value Decomposition in Natural Language Processing. 
|journal=EACL |year=2006 |citeseerx=10.1.1.102.2084}}&lt;/ref&gt;</div></td> </tr> <tr> <td class="diff-marker"></td> <td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><br /></td> <td class="diff-marker"></td> <td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><br /></td> </tr> <tr> <td class="diff-marker" data-marker="−"></td> <td style="color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;"><div>:&lt;math&gt;\,\Delta w_{ij} ~ = ~ \eta\left(<del style="font-weight: bold; text-decoration: none;">y_i </del>x_j -<del style="font-weight: bold; text-decoration: none;"> y_i</del> \sum_{k=1}^{i} w_{kj} y_k \right)&lt;/math&gt;</div></td> <td class="diff-marker" data-marker="+"></td> <td style="color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;"><div>:&lt;math&gt;\,\Delta w_{ij} ~ = ~ \eta<ins style="font-weight: bold; text-decoration: none;"> y_i </ins>\left(x_j - \sum_{k=1}^{i} w_{kj} y_k \right)&lt;/math&gt;</div></td> </tr> <tr> <td class="diff-marker"></td> <td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><br /></td> <td class="diff-marker"></td> <td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><br /></td> </tr> <tr> <td class="diff-marker"></td> <td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><div>where &lt;math&gt;\eta&lt;/math&gt; is the ''[[learning rate]]'' parameter.</div></td> <td class="diff-marker"></td> <td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><div>where &lt;math&gt;\eta&lt;/math&gt; is the ''[[learning rate]]'' parameter.</div></td> </tr> <tr> <td class="diff-marker" data-marker="−"></td> <td style="color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;"><br /></td> <td colspan="2" class="diff-empty diff-side-added"></td> </tr> <tr> <td class="diff-marker" data-marker="−"></td> <td style="color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;"><div>[[Oja's rule]] is the special case where &lt;math&gt;m = 1&lt;/math&gt;.</div></td> <td colspan="2" class="diff-empty diff-side-added"></td> </tr> <tr> <td class="diff-marker"></td> <td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 
1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><br /></td> <td class="diff-marker"></td> <td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><br /></td> </tr> <tr> <td class="diff-marker"></td> <td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><div>===Derivation===</div></td> <td class="diff-marker"></td> <td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><div>===Derivation===</div></td> </tr> <tr> <td colspan="2" class="diff-lineno">Line 34:</td> <td colspan="2" class="diff-lineno">Line 32:</td> </tr> <tr> <td class="diff-marker"></td> <td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><div>===Stability and PCA===</div></td> <td class="diff-marker"></td> <td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><div>===Stability and PCA===</div></td> </tr> <tr> <td class="diff-marker"></td> <td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><div>&lt;ref name="Haykin98"&gt;{{cite book |last=Haykin |first=Simon |author-link=Simon Haykin |title=Neural Networks: A Comprehensive Foundation |edition=2 |year=1998 |publisher=Prentice Hall |isbn=978-0-13-273350-2 }}&lt;/ref&gt;</div></td> <td class="diff-marker"></td> <td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><div>&lt;ref name="Haykin98"&gt;{{cite book |last=Haykin |first=Simon |author-link=Simon Haykin |title=Neural Networks: A Comprehensive Foundation |edition=2 |year=1998 |publisher=Prentice Hall |isbn=978-0-13-273350-2 }}&lt;/ref&gt;</div></td> </tr> <tr> <td colspan="2" class="diff-empty diff-side-deleted"></td> <td class="diff-marker" data-marker="+"></td> <td style="color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;"><br /></td> </tr> <tr> <td class="diff-marker" data-marker="−"></td> <td style="color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;"><div>&lt;ref name="Oja82"&gt;{{cite journal |last=Oja |first=Erkki |author-link=Erkki Oja |date=November 1982 |title=Simplified neuron model as a principal component analyzer |journal=Journal of Mathematical Biology |volume=15 |issue=3 |pages=267–273<del style="font-weight: bold; text-decoration: none;"> |id=BF00275687</del> |doi=10.1007/BF00275687 |pmid=7153672 |s2cid=16577977 
}}&lt;/ref&gt;</div></td> <td class="diff-marker" data-marker="+"></td> <td style="color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;"><div><ins style="font-weight: bold; text-decoration: none;">[[Oja's rule]] is the special case where &lt;math&gt;m = 1&lt;/math&gt;.</ins>&lt;ref name="Oja82"&gt;{{cite journal |last=Oja |first=Erkki |author-link=Erkki Oja |date=November 1982 |title=Simplified neuron model as a principal component analyzer |journal=Journal of Mathematical Biology |volume=15 |issue=3 |pages=267–273 |doi=10.1007/BF00275687 |pmid=7153672 |s2cid=16577977 <ins style="font-weight: bold; text-decoration: none;">|id=BF00275687</ins>}}&lt;/ref&gt;<ins style="font-weight: bold; text-decoration: none;"> One can think of the GHA as iterating Oja's rule. </ins></div></td> </tr> <tr> <td colspan="2" class="diff-empty diff-side-deleted"></td> <td class="diff-marker" data-marker="+"></td> <td style="color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;"><br /></td> </tr> <tr> <td colspan="2" class="diff-empty diff-side-deleted"></td> <td class="diff-marker" data-marker="+"></td> <td style="color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;"><div>With Oja's rule, &lt;math&gt;w_1&lt;/math&gt; is learned, and it has the same direction as the largest principal component vector is learned, with length determined by &lt;math&gt;E[x_j] = E[w_{1j} y_1]&lt;/math&gt; for all &lt;math&gt;j &lt;/math&gt;, where the expectation is taken over all input-output pairs. In other words, the length of the vector &lt;math&gt;w_1&lt;/math&gt; is such that we have an autoencoder, with the latent code &lt;math&gt;y_1 = \sum_i w_{1i} x_i &lt;/math&gt;, such that &lt;math&gt;E[\| x - y_1 w_1 \|^2] &lt;/math&gt; is minimized.</div></td> </tr> <tr> <td colspan="2" class="diff-empty diff-side-deleted"></td> <td class="diff-marker" data-marker="+"></td> <td style="color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;"><br /></td> </tr> <tr> <td colspan="2" class="diff-empty diff-side-deleted"></td> <td class="diff-marker" data-marker="+"></td> <td style="color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;"><div>When &lt;math&gt;m = 2 &lt;/math&gt;, the first neuron in the hidden layer of the autoencoder still learns as described, since it is unaffected by the other second neuron. So, after the first neuron and its vector &lt;math&gt;w_1&lt;/math&gt; has converged, the second neuron is effectively running another Oja's rule on the modified input vectors, defined by &lt;math&gt;x' = x - y_1w_1 &lt;/math&gt;, which we know is the input vector with the first principal component removed. Therefore, the second neuron learns to code for the second principal component. 
</div></td> </tr> <tr> <td colspan="2" class="diff-empty diff-side-deleted"></td> <td class="diff-marker" data-marker="+"></td> <td style="color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;"><br /></td> </tr> <tr> <td colspan="2" class="diff-empty diff-side-deleted"></td> <td class="diff-marker" data-marker="+"></td> <td style="color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;"><div>By induction, this results in finding the top-&lt;math&gt;m &lt;/math&gt; principal components for arbitrary &lt;math&gt;m &lt;/math&gt;.</div></td> </tr> <tr> <td class="diff-marker"></td> <td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><br /></td> <td class="diff-marker"></td> <td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><br /></td> </tr> <tr> <td class="diff-marker"></td> <td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><div>==Applications==</div></td> <td class="diff-marker"></td> <td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><div>==Applications==</div></td> </tr> </table> Cosmia Nebula https://en.wikipedia.org/w/index.php?title=Generalized_Hebbian_algorithm&diff=1258231777&oldid=prev Cosmia Nebula: /* Theory */ 2024-11-18T21:22:22Z <p><span class="autocomment">Theory</span></p> <table style="background-color: #fff; color: #202122;" data-mw="interface"> <col class="diff-marker" /> <col class="diff-content" /> <col class="diff-marker" /> <col class="diff-content" /> <tr class="diff-title" lang="en"> <td colspan="2" style="background-color: #fff; color: #202122; text-align: center;">← Previous revision</td> <td colspan="2" style="background-color: #fff; color: #202122; text-align: center;">Revision as of 21:22, 18 November 2024</td> </tr><tr> <td colspan="2" class="diff-lineno">Line 5:</td> <td colspan="2" class="diff-lineno">Line 5:</td> </tr> <tr> <td class="diff-marker"></td> <td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><div>Consider a problem of learning a linear code for some data. Each data is a multi-dimensional vector &lt;math&gt;x \in \R^n&lt;/math&gt;, and can be (approximately) represented as a linear sum of linear code vectors &lt;math&gt;w_1, \dots, w_m&lt;/math&gt;. When &lt;math&gt;m = n&lt;/math&gt;, it is possible to exactly represent the data. If &lt;math&gt;m &lt; n&lt;/math&gt;, it is possible to approximately represent the data. 
To minimize the L2 loss of representation, &lt;math&gt;w_1, \dots, w_m&lt;/math&gt; should be the highest principal component vectors.</div></td> <td class="diff-marker"></td> <td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><div>Consider a problem of learning a linear code for some data. Each data is a multi-dimensional vector &lt;math&gt;x \in \R^n&lt;/math&gt;, and can be (approximately) represented as a linear sum of linear code vectors &lt;math&gt;w_1, \dots, w_m&lt;/math&gt;. When &lt;math&gt;m = n&lt;/math&gt;, it is possible to exactly represent the data. If &lt;math&gt;m &lt; n&lt;/math&gt;, it is possible to approximately represent the data. To minimize the L2 loss of representation, &lt;math&gt;w_1, \dots, w_m&lt;/math&gt; should be the highest principal component vectors.</div></td> </tr> <tr> <td class="diff-marker"></td> <td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><br /></td> <td class="diff-marker"></td> <td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><br /></td> </tr> <tr> <td class="diff-marker" data-marker="−"></td> <td style="color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;"><div>The GHA is an iterative <del style="font-weight: bold; text-decoration: none;">method</del> to <del style="font-weight: bold; text-decoration: none;">learn</del> the highest principal component vectors in <del style="font-weight: bold; text-decoration: none;">a</del> form that resembles [[Unsupervised learning|unsupervised]] Hebbian learning in neural networks.</div></td> <td class="diff-marker" data-marker="+"></td> <td style="color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;"><div>The GHA is an iterative <ins style="font-weight: bold; text-decoration: none;">algorithm</ins> to <ins style="font-weight: bold; text-decoration: none;">find</ins> the highest principal component vectors<ins style="font-weight: bold; text-decoration: none;">,</ins> in <ins style="font-weight: bold; text-decoration: none;">an algorithmic</ins> form that resembles [[Unsupervised learning|unsupervised]] Hebbian learning in neural networks.</div></td> </tr> <tr> <td class="diff-marker"></td> <td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><br /></td> <td class="diff-marker"></td> <td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><br /></td> </tr> <tr> <td class="diff-marker"></td> <td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; 
white-space: pre-wrap;"><div>Consider a one-layered neural network with &lt;math&gt;n&lt;/math&gt; input neurons and &lt;math&gt;m&lt;/math&gt; output neurons &lt;math&gt;y_1, \dots, y_m&lt;/math&gt;. The linear code vectors are the connection strengths, that is, &lt;math&gt;w_{ij}&lt;/math&gt; is the [[synaptic weight]] or connection strength between the &lt;math&gt;j&lt;/math&gt;-th input and &lt;math&gt;i&lt;/math&gt;-th output neurons.</div></td> <td class="diff-marker"></td> <td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><div>Consider a one-layered neural network with &lt;math&gt;n&lt;/math&gt; input neurons and &lt;math&gt;m&lt;/math&gt; output neurons &lt;math&gt;y_1, \dots, y_m&lt;/math&gt;. The linear code vectors are the connection strengths, that is, &lt;math&gt;w_{ij}&lt;/math&gt; is the [[synaptic weight]] or connection strength between the &lt;math&gt;j&lt;/math&gt;-th input and &lt;math&gt;i&lt;/math&gt;-th output neurons.</div></td> </tr> </table> Cosmia Nebula https://en.wikipedia.org/w/index.php?title=Generalized_Hebbian_algorithm&diff=1258231526&oldid=prev Cosmia Nebula: /* Theory */ Oja's rule 2024-11-18T21:20:45Z <p><span class="autocomment">Theory: </span> Oja&#039;s rule</p> <table style="background-color: #fff; color: #202122;" data-mw="interface"> <col class="diff-marker" /> <col class="diff-content" /> <col class="diff-marker" /> <col class="diff-content" /> <tr class="diff-title" lang="en"> <td colspan="2" style="background-color: #fff; color: #202122; text-align: center;">← Previous revision</td> <td colspan="2" style="background-color: #fff; color: #202122; text-align: center;">Revision as of 21:20, 18 November 2024</td> </tr><tr> <td colspan="2" class="diff-lineno">Line 14:</td> <td colspan="2" class="diff-lineno">Line 14:</td> </tr> <tr> <td class="diff-marker"></td> <td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><br /></td> <td class="diff-marker"></td> <td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><br /></td> </tr> <tr> <td class="diff-marker"></td> <td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><div>where &lt;math&gt;\eta&lt;/math&gt; is the ''[[learning rate]]'' parameter.</div></td> <td class="diff-marker"></td> <td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><div>where &lt;math&gt;\eta&lt;/math&gt; is the ''[[learning rate]]'' parameter.</div></td> </tr> <tr> <td colspan="2" class="diff-empty diff-side-deleted"></td> <td class="diff-marker" data-marker="+"></td> <td style="color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;"><br /></td> </tr> <tr> <td colspan="2" class="diff-empty diff-side-deleted"></td> <td 
class="diff-marker" data-marker="+"></td> <td style="color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;"><div>[[Oja's rule]] is the special case where &lt;math&gt;m = 1&lt;/math&gt;.</div></td> </tr> <tr> <td class="diff-marker"></td> <td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><br /></td> <td class="diff-marker"></td> <td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><br /></td> </tr> <tr> <td class="diff-marker"></td> <td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><div>===Derivation===</div></td> <td class="diff-marker"></td> <td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><div>===Derivation===</div></td> </tr> </table> Cosmia Nebula https://en.wikipedia.org/w/index.php?title=Generalized_Hebbian_algorithm&diff=1258231367&oldid=prev Cosmia Nebula: /* Theory */ 2024-11-18T21:19:48Z <p><span class="autocomment">Theory</span></p> <table style="background-color: #fff; color: #202122;" data-mw="interface"> <col class="diff-marker" /> <col class="diff-content" /> <col class="diff-marker" /> <col class="diff-content" /> <tr class="diff-title" lang="en"> <td colspan="2" style="background-color: #fff; color: #202122; text-align: center;">← Previous revision</td> <td colspan="2" style="background-color: #fff; color: #202122; text-align: center;">Revision as of 21:19, 18 November 2024</td> </tr><tr> <td colspan="2" class="diff-lineno">Line 3:</td> <td colspan="2" class="diff-lineno">Line 3:</td> </tr> <tr> <td class="diff-marker"></td> <td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><br /></td> <td class="diff-marker"></td> <td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><br /></td> </tr> <tr> <td class="diff-marker"></td> <td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><div>==Theory==</div></td> <td class="diff-marker"></td> <td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><div>==Theory==</div></td> </tr> <tr> <td colspan="2" class="diff-empty diff-side-deleted"></td> <td class="diff-marker" data-marker="+"></td> <td style="color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; 
white-space: pre-wrap;"><div>Consider a problem of learning a linear code for some data. Each data is a multi-dimensional vector &lt;math&gt;x \in \R^n&lt;/math&gt;, and can be (approximately) represented as a linear sum of linear code vectors &lt;math&gt;w_1, \dots, w_m&lt;/math&gt;. When &lt;math&gt;m = n&lt;/math&gt;, it is possible to exactly represent the data. If &lt;math&gt;m &lt; n&lt;/math&gt;, it is possible to approximately represent the data. To minimize the L2 loss of representation, &lt;math&gt;w_1, \dots, w_m&lt;/math&gt; should be the highest principal component vectors.</div></td> </tr> <tr> <td class="diff-marker" data-marker="−"></td> <td style="color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;"><div>The GHA combines Oja's rule with the [[Gram-Schmidt process]] to produce a learning rule of the form</div></td> <td colspan="2" class="diff-empty diff-side-added"></td> </tr> <tr> <td class="diff-marker"></td> <td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><br /></td> <td class="diff-marker"></td> <td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><br /></td> </tr> <tr> <td colspan="2" class="diff-empty diff-side-deleted"></td> <td class="diff-marker" data-marker="+"></td> <td style="color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;"><div>The GHA is an iterative method to learn the highest principal component vectors in a form that resembles [[Unsupervised learning|unsupervised]] Hebbian learning in neural networks.</div></td> </tr> <tr> <td class="diff-marker"><a class="mw-diff-movedpara-left" title="Paragraph was moved. Click to jump to new location." href="#movedpara_10_3_rhs">&#x26AB;</a></td> <td style="color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;"><div><a name="movedpara_5_0_lhs"></a>:&lt;math&gt;\,\Delta w_{ij} ~ = ~ \eta\left(y_i x_j - y_i \sum_{k=1}^{i} w_{kj} y_k \right)&lt;/math&gt;<del style="font-weight: bold; text-decoration: none;">,&lt;ref&gt;{{Citation</del></div></td> <td colspan="2" class="diff-empty diff-side-added"></td> </tr> <tr> <td class="diff-marker" data-marker="−"></td> <td style="color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;"><div> | last = Gorrell</div></td> <td colspan="2" class="diff-empty diff-side-added"></td> </tr> <tr> <td class="diff-marker" data-marker="−"></td> <td style="color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;"><div> | first =Genevieve</div></td> <td colspan="2" class="diff-empty diff-side-added"></td> </tr> <tr> <td class="diff-marker"><a class="mw-diff-movedpara-left" title="Paragraph was moved. Click to jump to new location." 
href="#movedpara_10_1_rhs">&#x26AB;</a></td> <td style="color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;"><div><a name="movedpara_6_2_lhs"></a><del style="font-weight: bold; text-decoration: none;"> |</del> <del style="font-weight: bold; text-decoration: none;">title</del> = Generalized Hebbian Algorithm for Incremental Singular Value Decomposition in Natural Language Processing.</div></td> <td colspan="2" class="diff-empty diff-side-added"></td> </tr> <tr> <td class="diff-marker" data-marker="−"></td> <td style="color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;"><div> | journal = EACL</div></td> <td colspan="2" class="diff-empty diff-side-added"></td> </tr> <tr> <td class="diff-marker" data-marker="−"></td> <td style="color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;"><div> | year = 2006</div></td> <td colspan="2" class="diff-empty diff-side-added"></td> </tr> <tr> <td class="diff-marker" data-marker="−"></td> <td style="color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;"><div> | citeseerx =10.1.1.102.2084</div></td> <td colspan="2" class="diff-empty diff-side-added"></td> </tr> <tr> <td class="diff-marker" data-marker="−"></td> <td style="color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;"><div> }}&lt;/ref&gt;</div></td> <td colspan="2" class="diff-empty diff-side-added"></td> </tr> <tr> <td class="diff-marker"></td> <td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><br /></td> <td class="diff-marker"></td> <td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><br /></td> </tr> <tr> <td colspan="2" class="diff-empty diff-side-deleted"></td> <td class="diff-marker" data-marker="+"></td> <td style="color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;"><div>Consider a one-layered neural network with &lt;math&gt;n&lt;/math&gt; input neurons and &lt;math&gt;m&lt;/math&gt; output neurons &lt;math&gt;y_1, \dots, y_m&lt;/math&gt;. 
The linear code vectors are the connection strengths, that is, &lt;math&gt;w_{ij}&lt;/math&gt; is the [[synaptic weight]] or connection strength between the &lt;math&gt;j&lt;/math&gt;-th input and &lt;math&gt;i&lt;/math&gt;-th output neurons.</div></td> </tr> <tr> <td class="diff-marker" data-marker="−"></td> <td style="color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;"><div>where {{math|''w''&lt;sub&gt;''ij''&lt;/sub&gt;}} defines the [[synaptic weight]] or connection strength between the {{math|''j''}}th input and {{math|''i''}}th output neurons, {{math|''x''}} and {{math|''y''}} are the input and output vectors, respectively, and {{math|''η''}} is the ''[[learning rate]]'' parameter.</div></td> <td colspan="2" class="diff-empty diff-side-added"></td> </tr> <tr> <td colspan="2" class="diff-empty diff-side-deleted"></td> <td class="diff-marker" data-marker="+"></td> <td style="color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;"><br /></td> </tr> <tr> <td colspan="2" class="diff-empty diff-side-deleted"></td> <td class="diff-marker"><a class="mw-diff-movedpara-right" title="Paragraph was moved. Click to jump to old location." href="#movedpara_6_2_lhs">&#x26AB;</a></td> <td style="color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;"><div><a name="movedpara_10_1_rhs"></a><ins style="font-weight: bold; text-decoration: none;">The</ins> <ins style="font-weight: bold; text-decoration: none;">GHA</ins> <ins style="font-weight: bold; text-decoration: none;">learning rule is of the form&lt;ref&gt;{{Citation |last</ins>=<ins style="font-weight: bold; text-decoration: none;">Gorrell</ins> <ins style="font-weight: bold; text-decoration: none;">|first=Genevieve |title=</ins>Generalized Hebbian Algorithm for Incremental Singular Value Decomposition in Natural Language Processing.<ins style="font-weight: bold; text-decoration: none;"> |journal=EACL |year=2006 |citeseerx=10.1.1.102.2084}}&lt;/ref&gt;</ins></div></td> </tr> <tr> <td colspan="2" class="diff-empty diff-side-deleted"></td> <td class="diff-marker" data-marker="+"></td> <td style="color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;"><br /></td> </tr> <tr> <td colspan="2" class="diff-empty diff-side-deleted"></td> <td class="diff-marker"><a class="mw-diff-movedpara-right" title="Paragraph was moved. Click to jump to old location." 
href="#movedpara_5_0_lhs">&#x26AB;</a></td> <td style="color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;"><div><a name="movedpara_10_3_rhs"></a>:&lt;math&gt;\,\Delta w_{ij} ~ = ~ \eta\left(y_i x_j - y_i \sum_{k=1}^{i} w_{kj} y_k \right)&lt;/math&gt;</div></td> </tr> <tr> <td colspan="2" class="diff-empty diff-side-deleted"></td> <td class="diff-marker" data-marker="+"></td> <td style="color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;"><br /></td> </tr> <tr> <td colspan="2" class="diff-empty diff-side-deleted"></td> <td class="diff-marker" data-marker="+"></td> <td style="color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;"><div>where &lt;math&gt;\eta&lt;/math&gt; is the ''[[learning rate]]'' parameter.</div></td> </tr> <tr> <td class="diff-marker"></td> <td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><br /></td> <td class="diff-marker"></td> <td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><br /></td> </tr> <tr> <td class="diff-marker"></td> <td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><div>===Derivation===</div></td> <td class="diff-marker"></td> <td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><div>===Derivation===</div></td> </tr> </table> Cosmia Nebula