Wikipedia talk:Articles for creation/Zemor's Decoding Algorithm and Zemor's decoding algorithm: Difference between pages

Content deleted Content added

VisualWikitext

@@ Line 1: / Line 1: @@
-#REDIRECT [[Zemor's decoding algorithm]]
+{{Short description|Coding theory algorithm}}
+In [[coding theory]], '''Zemor's algorithm''', designed and developed by Gilles Zemor,<ref name="vg1">{{Cite web|url=https://www.math.u-bordeaux.fr/~gzemor/|title=Gilles Zémor|website=www.math.u-bordeaux.fr|accessdate=9 April 2023}}</ref> is a recursive low-complexity approach to code construction.  It is an improvement over the algorithm of [[Michael Sipser|Sipser]] and [[Daniel Spielman|Spielman]].
-{{R from move}}
+Zemor considered a typical class of Sipser–Spielman construction of [[expander code]]s, where the underlying graph is [[bipartite graph]]. Sipser and Spielman introduced a constructive family of asymptotically good linear-error codes together with a simple parallel algorithm that will always remove a constant fraction of errors.  The article is based on Dr. Venkatesan Guruswami's course notes <ref name="vg2">{{cite web|url=http://www.cs.washington.edu/education/courses/cse590vg/03wi/scribes/1-27.ps|archive-url=https://web.archive.org/web/20140224135408/https://courses.cs.washington.edu/courses/cse590vg/03wi/scribes/1-27.ps|archive-date=2014-02-24|title=Lecture 5 |work=CSE590G: Codes and Pseudorandom Objects|publisher=University of Washington|first1=Venkatesan|last1=Guruswami|first2=Matt|last2=Cary|date=January 27, 2003}}</ref>
+== Code construction ==
+Zemor's algorithm is based on a type of [[expander graphs]] called [[Tanner graph]]. The construction of code was first proposed by Tanner.<ref name=vg3>{{cite web |url=http://www.cs.washington.edu/education/courses/cse533/06au/lecnotes/lecture14.pdf |title=Lecture notes|website=washington.edu|access-date=9 April 2023}}</ref> The codes are based on [[double cover (topology)|double cover]] <math>d</math>, regular expander <math>G</math>, which is a bipartite graph. <math>G</math> =<math> \left(V,E\right)</math>, where <math>V</math> is the set of vertices and <math>E</math> is the set of edges and <math>V</math> = <math>A</math> <math>\cup</math> <math>B</math> and <math>A</math> <math>\cap</math> <math>B</math> = <math>\emptyset</math>, where <math>A</math> and <math>B</math> denotes sets of vertices. Let <math>n</math> be the number of vertices in each group, ''i.e'', <math>|A| =|B| =n</math>. The edge set <math>E</math> be of size <math>N</math> =<math>nd</math> and every edge in <math>E</math> has one endpoint in both <math>A</math> and <math>B</math>. <math>E(v)</math> denotes the set of edges containing <math>v</math>.
+Assume an ordering on <math>V</math>, therefore ordering will be done on every edges of <math>E(v)</math> for every <math>v \in V</math>. Let [[finite field]] <math>\mathbb{F}=GF(2)</math>, and for a word  <math>x=(x_e), e\in E</math> in <math>\mathbb{F}^N</math>, let the subword of the  word will be indexed by <math>E(v)</math>. Let that word be denoted by <math>(x)_v</math>. The subset of vertices <math>A</math> and <math>B</math> induces every word <math>x\in \mathbb{F}^N</math> a partition into <math>n</math> non-overlapping sub-words <math> \left(x\right)_v\in \mathbb{F}^d</math>, where <math>v</math> ranges over the elements of <math>A</math>.
+For constructing a code <math>C</math>, consider a linear subcode <math>C_o</math>, which is a <math>[d,r_od,\delta] </math> code, where <math>q</math>, the size of the alphabet is <math>2</math>. For any vertex <math>v \in V</math>, let <math> v(1), v(2),\ldots,v(d)</math> be some ordering of the <math>d</math> vertices of <math>E</math> adjacent to <math>v</math>. In this code, each bit <math>x_e</math> is linked with an edge <math>e</math> of <math>E</math>.
+We can define the code <math>C</math> to be the set of binary vectors <math>x = \left( x_1,x_2,\ldots,x_N \right)</math> of <math>\{0,1\}^N </math> such that, for every vertex <math>v</math> of <math>V</math>, <math> \left(x_{v(1)}, x_{v(2)},\ldots, x_{v(d)}\right) </math> is a code word of <math>C_o</math>. In this case, we can consider a special case when every edge of <math>E</math> is adjacent to exactly <math>2</math> vertices of <math>V</math>. It means that <math>V</math> and <math>E</math> make up, respectively, the vertex set and edge set of <math>d</math> regular graph <math>G</math>.
+Let us call the code <math>C</math> constructed in this way as <math>\left(G,C_o\right) </math> code. For a given graph <math>G</math> and a given code <math>C_o</math>, there are several <math>\left(G,C_o\right) </math> codes as there are different ways of ordering edges incident to a given vertex <math>v</math>, i.e., <math> v(1), v(2),\ldots,v(d) </math>. In fact our code <math>C</math> consist of all codewords such that <math>x_v\in C_o</math> for all <math>v \in A, B</math>. The code <math> C</math> is linear <math>[N,K,D] </math> in <math>\mathbb{F}</math> as it is generated from a subcode <math>C_o</math>, which is linear. The code <math>C</math> is defined as <math>C=\{c\in \mathbb{F}^N :(c)_v \in C_o\}</math> for every <math>v\in V</math>.
+[[File:Zemor Decoding.jpg|thumb|upright=2.0|alt=A |Graph G and code C]]
+In this figure, <math>(x)_v =\left(x_{e1}, x_{e2}, x_{e3}, x_{e4}\right)\in C_o</math>. It shows the graph <math>G</math> and code <math>C</math>.
+In matrix <math>G</math>, let <math>\lambda</math> is equal to the second largest [[eigenvalue]] of [[adjacency matrix]] of <math>G</math>. Here the largest eigenvalue is <math> d</math>.
+Two important claims are made:
+=== Claim 1 ===
+'''<math>\left(\dfrac{K}{N}\right)\geq 2r_o-1</math>'''<br />
+''. Let <math>R</math> be the rate of a linear code constructed from a bipartite graph whose digit nodes have degree <math>m</math> and whose subcode nodes have degree <math>n</math>. If a single linear code with parameters <math>\left(n,k\right)</math> and rate <math>r = \left(\dfrac{k}{n}\right)</math> is associated with each of the subcode nodes, then <math>k\geq 1-\left(1-r\right)m</math>''.
+==== Proof ====
+Let <math>R</math> be the rate of the linear code, which is equal to <math>K/N</math>
+Let there are <math>S</math> subcode nodes in the graph. If the degree of the subcode is <math>n</math>, then the code must have <math>\left(\dfrac{n}{m}\right) S</math> digits, as each digit node is connected to <math>m</math> of the <math>\left(n\right)S</math> edges in the graph. Each subcode node contributes <math>(n-k)</math> equations to parity check matrix for a total of <math>\left(n-k\right) S</math>. These equations may not be linearly independent.
+Therefore, <math>\left(\dfrac{K}{N}\right)\geq \left(\dfrac{(\dfrac{n}{m})S - (n-k)S}{(\dfrac{n}{m}) S}\right)</math><br />
+<math>\geq 1-m\left(\dfrac{n-k}{n}\right)</math><br />
+<math>\geq 1-m \left(1-r\right) </math>, Since the value of <math>m</math>, i.e., the digit node of this bipartite graph is <math>2</math> and here <math>r = r_o</math>, we can write as:<br />
+<math>\left(\dfrac{K}{N}\right)\geq 2r_o -1</math>
+=== Claim 2 ===
+: <math>D\geq N\left(\dfrac{(\delta-(\dfrac{\lambda}{d}))}{(1-(\dfrac{\lambda}{d})})\right)^2</math>
+: <math>=N\left(\delta^2- O\left(\dfrac{\lambda}{d}\right)\right)</math> <math>\rightarrow (1)</math>
+''If <math> S </math> is linear code of rate <math>r</math>, block code length <math> d</math>, and minimum relative distance <math>\delta</math>, and if <math>B</math> is the edge vertex incidence graph of a <math>d</math> – regular graph with second largest eigenvalue <math>\lambda</math>, then the code <math>C(B,S)</math> has rate at least <math>2r_o -1</math> and minimum relative distance at least <math>\left(\left(\dfrac{\delta- \left(\dfrac{\lambda}{d}\right)}{1-\left(\dfrac{\lambda}{d}\right)}\right)\right)^2</math>.
+==== Proof ====
+Let <math> B</math> be derived from the <math>d</math> regular graph <math>G</math>. So, the number of variables of <math>C(B,S)</math> is <math> \left(\dfrac{dn}{2}\right)</math> and the number of constraints is <math>n</math>. According to Alon - Chung,<ref name="vg4">{{cite journal |author1=N. Alon |author2=F.R.K. Chung |title=Explicit construction of linear sized tolerant networks |journal=[[Discrete Mathematics (journal)|Discrete Mathematics]] |date=December 1988 |volume=72 |issue=1–3 |pages=15–19 |doi=10.1016/0012-365X(88)90189-6|citeseerx=10.1.1.300.7495 }}</ref> if <math>X</math> is a subset of vertices of <math>G</math> of size <math>\gamma n</math>, then the number of edges contained in the subgraph is induced by <math>X</math> in <math>G</math> is at most <math>\left(\dfrac{dn}{2}\right) \left(\gamma^2 + (\dfrac{\lambda}{d})\gamma \left(1-\gamma\right)\right)</math>.
+As a result, any set of <math>\left(\dfrac{dn}{2}\right) \left(\gamma^2 + \left(\dfrac{\lambda}{d}\right)\gamma \left(1-\gamma\right)\right)</math> variables will be having at least <math>\gamma n</math> constraints as neighbours. So the average number of variables per constraint is : <math>\left(\dfrac{(\dfrac{2nd}{2}) \left(\gamma^2 + (\dfrac{\lambda}{d})\gamma \left(1-\gamma\right)\right)}{\gamma n}\right)</math>
+<math>= d\left( \gamma + (\dfrac{\lambda}{d}) \left( 1-\gamma\right)\right)</math>           <math>\rightarrow (2)</math>
+So if <math> d\left( \gamma + (\dfrac{\lambda}{d}) \left( 1-\gamma\right)\right) < \gamma d</math>, then a word of relative weight  <math>  \left(\gamma^2 + (\dfrac{\lambda}{d})\gamma \left(1-\gamma\right)\right)</math>, cannot be a codeword of <math>C(B,S)</math>. The inequality <math>(2)</math> is satisfied for <math>\gamma < \left(\dfrac{1-(\dfrac{\lambda}{d})}{\delta-(\dfrac{\lambda}{d})}\right)</math>. Therefore, <math>C(B,S)</math> cannot have a non zero codeword of relative weight   <math>\left(\dfrac{\delta-(\dfrac{\lambda}{d})}{1-(\dfrac{\lambda}{d})}\right)^2</math> or less.
+In matrix <math>G</math>, we can assume that <math>\lambda/d</math> is bounded away from <math>1</math>. For those values of <math> d</math> in which <math>d-1</math> is odd prime, there are explicit constructions of sequences of <math> d</math> - regular bipartite graphs with arbitrarily large number of vertices such that each graph <math>G</math> in the sequence is a  [[Ramanujan graph]]. It is called Ramanujan graph as it satisfies the inequality <math>\lambda(G) \leq 2\sqrt{d-1}</math>. Certain expansion properties are visible in graph <math>G</math> as the separation between the eigenvalues <math>d</math> and <math> \lambda </math>. If the graph <math> G </math> is Ramanujan graph, then that expression  <math>(1)</math> will become <math>0</math> eventually as <math> d </math> becomes large.
+== Zemor's algorithm ==
+The iterative decoding algorithm written below alternates between the vertices <math>A</math> and <math>B</math> in <math>G</math> and corrects the codeword of <math>C_o</math> in <math>A</math> and then it switches to correct the codeword <math>C_o</math> in <math>B</math>. Here edges associated with a vertex on one side of a graph are not incident to other vertex on that side. In fact, it doesn't matter in which order, the set of nodes <math>A</math> and <math>B</math> are processed. The vertex processing can also be done in parallel.
+The decoder <math>\mathbb{D}:\mathbb{F}^d \rightarrow C_o</math> stands for a decoder for <math>C_o</math> that recovers correctly with any codewords with less than <math>\left(\dfrac{d}{2}\right)</math> errors.
+=== Decoder algorithm ===
+Received word : <math>w=(w_e), e\in E</math><br />
+<code>
+<math>z \leftarrow w</math><br />
+For  <math>t \leftarrow 1</math> to <math>m</math> do         //<math>m</math> is the number of iterations<br />
+{ if (<math>t</math> is odd)                                       // Here the algorithm will alternate between its two vertex sets.<br />
+<math>X \leftarrow A</math><br />
+else  <math>X \leftarrow B</math>  <br />
+Iteration <math>t</math>: For every <math>v \in X</math>, let <math>(z)_v \leftarrow \mathbb{D}((z)_v)</math> // Decoding <math>z_v</math> to its nearest codeword.<br />
+}<br />
+</code>
+Output: <math>z</math>
+=== Explanation of the algorithm ===
+Since <math>G</math> is bipartite, the set <math>A</math> of vertices induces the partition of the edge set <math>E</math> = <math>\cup_{v\in A}  E_v</math> . The set <math>B</math> induces another partition, <math>E</math> = <math>\cup_{v\in B}  E_v</math> .
+Let <math>w \in \{0,1\}^N</math> be the received vector, and recall that <math>N=dn</math>. The first iteration of the algorithm consists of applying the complete decoding for the code induced by <math>E_v</math> for every <math>v \in A</math> . This means that for replacing, for every <math>v \in A</math>, the vector <math>\left( w_{v(1)}, w_{v(2)}, \ldots, w_{v(d)}\right)</math> by one of the closest codewords of <math>C_o</math>. Since the subsets of edges <math>E_v</math> are disjoint for <math>v \in A</math>, the decoding of these <math>n</math>  subvectors of <math>w</math> may be done in parallel.
+The iteration will yield a new vector <math>z</math>. The next iteration consists of applying the preceding procedure to <math>z</math> but with <math>A</math> replaced by <math>B</math>. In other words, it consists of decoding all the subvectors induced by the vertices of <math>B</math>. The coming iterations repeat those two steps alternately applying parallel decoding to the subvectors induced by the vertices of <math>A</math> and to the subvectors induced by the vertices of <math>B</math>. <br />
+'''Note:''' [If <math>d=n</math> and <math>G</math> is the complete bipartite graph, then <math>C</math> is a product code of <math>C_o</math> with itself and the above algorithm reduces to the natural hard iterative decoding of product codes].
+Here, the number of iterations, <math>m</math> is <math>\left(\dfrac{(\log{n})}{\log(2-\alpha)}\right)</math>.
+In general, the above algorithm can correct a code word whose Hamming weight is no more than <math>(\dfrac{1}{2}).\alpha N \delta \left((\dfrac{\delta}{2})-(\dfrac{\lambda}{d})\right) =\left((\dfrac{1}{4}).\alpha N (\delta^2- O(\dfrac{\lambda}{d})\right)</math> for values of <math>\alpha < 1</math>. Here, the decoding algorithm is implemented as a circuit of size <math>O(N \log{N} )</math> and depth <math>O(\log{N})</math> that returns the codeword given that error vector has weight less than <math>\alpha N \delta^2 (1-\epsilon)/4</math> .
+=== Theorem ===
+''If <math>G</math> is a Ramanujan graph of sufficiently high degree, for any <math>\alpha < 1</math>, the decoding algorithm can correct <math>(\dfrac{\alpha \delta_o^2}{4})(1-\in) N </math> errors, in <math> O(\log {n}) </math> rounds ( where the big- <math>O</math> notation hides a dependence on <math>\alpha</math>). This can be implemented in linear time on a single processor; on <math>n</math> processors each round can be implemented in constant time.''
+==== Proof ====
+Since the  decoding algorithm is insensitive to the value of the edges and by linearity, we can assume that the transmitted codeword is the all zeros - vector. Let the received codeword be <math>w</math>. The set of edges which has an incorrect value while decoding is considered. Here by incorrect value, we mean <math>1</math> in any of the bits. Let <math>w=w^0</math> be the initial value of the codeword, <math>w^1, w^2,\ldots, w^t</math> be the values after first, second&nbsp;.&nbsp;.&nbsp;. <math>t</math> stages of decoding.
+Here, <math>X^i={e\in E|x_e^i =1}</math>, and  <math>S^i ={v\in V^i | E_v \cap X^{i+1} !=\emptyset}</math>.  Here <math>S^i</math> corresponds to those set of vertices that was not able to successfully decode their codeword in the <math>i^{th}</math> round. From the above algorithm <math>S^1 <S^0 </math>  as number of unsuccessful vertices will be corrected in every iteration. We can prove that <math>S^0>S^1>S^2>\cdots</math> is a decreasing sequence.
+In fact, <math>|S_{i+1}|<=(\dfrac{1}{2-\alpha})|S_i|</math>. As we are assuming, <math>\alpha<1</math>, the above equation is in a [[geometric series|geometric decreasing sequence]].
+So, when <math>|S_i|<n</math>, more than <math>log_{2-\alpha} n </math> rounds are necessary. Furthermore, <math>\sum|S_i|=n\sum(\dfrac{1}{(2-\alpha)^i})=O(n)</math>, and if we implement the <math>i^{th}</math> round in <math>O(|S_i|)</math> time, then the total sequential running time will be linear.
+== Drawbacks of Zemor's algorithm ==
+# It is lengthy process as the number of iterations <math>m</math> in decoder algorithm takes is <math>[(\log{ n})/(\log(2-\alpha))]</math>
+# Zemor's decoding algorithm finds it difficult to decode erasures. A detailed way of how we can improve the algorithm is
+given in.<ref name="vg5">{{cite web|url=https://www.cs.technion.ac.il/~vitalys/Papers/GMD-expander/GMD-decode-expander.ps |title=Archived copy |access-date=May 1, 2012 |url-status=dead |archive-url=https://web.archive.org/web/20040914064028/https://www.cs.technion.ac.il/~vitalys/Papers/GMD-expander/GMD-decode-expander.ps |archive-date=September 14, 2004 }}</ref>
+==See also==
+*[[Expander code]]s
+*[[Tanner graph]]
+*[[Linear time encoding and decoding of error-correcting codes]]
+==References==
+{{reflist}}
+[[Category:Coding theory]]
+[[Category:Error detection and correction]]