Abstract | We consider random systems of linear equations over GF(2) in which every equation binds k variables. We obtain a precise description of the clustering of solutions in such systems. In particular, we prove that with probability tending to 1 as the number of variables n grows, every pair of solutions \sigma, \tau falls into one of two categories: either there exists a sequence of solutions \sigma,...,\tau in which successive elements differ in O(log n) variables, or every sequence of solutions \sigma,...,\tau contains a step requiring the simultaneous change of \Omega(n) variables. Furthermore, we determine precisely which pairs of solutions are in each category. Our results are tight and highly quantitative in nature. Moreover, our proof highlights the role of unique extendability as the driving force behind the success of Low Density Parity Check codes, and our techniques also apply to the problem of so-called pseudo-codewords in such codes. |
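To make the objects in the abstract concrete, the following is a minimal sketch, not the paper's method, of a random k-variable system over GF(2) and its solution set. It relies only on the standard fact that the solutions of a consistent linear system form a coset of the kernel, so any two solutions \sigma, \tau differ by a kernel vector, and the number of variables changed in a step is the Hamming weight of that vector. The function names, the bitmask encoding, and the optional `planted` assignment (used only to guarantee that the demo system is consistent) are assumptions introduced for illustration.

```python
import random


def random_kxor_system(n, m, k, planted=None, seed=0):
    """Return m equations over n variables, each as (k distinct indices, rhs bit).

    If `planted` (an int bitmask) is given, right-hand sides are chosen so that
    it is a solution; otherwise they are uniform random bits.
    """
    rng = random.Random(seed)
    eqs = []
    for _ in range(m):
        vs = tuple(sorted(rng.sample(range(n), k)))
        if planted is None:
            rhs = rng.randrange(2)
        else:
            rhs = sum((planted >> v) & 1 for v in vs) % 2
        eqs.append((vs, rhs))
    return eqs


def satisfies(eqs, x):
    """Check whether the assignment x (int bitmask) satisfies every equation."""
    return all(sum((x >> v) & 1 for v in vs) % 2 == rhs for vs, rhs in eqs)


def solve_gf2(eqs, n):
    """Gauss-Jordan elimination over GF(2) on int bitmask rows.

    Bit i of a row is the coefficient of variable i; bit n is the right-hand side.
    Returns (particular_solution, kernel_basis), or None if inconsistent.
    """
    var_mask = (1 << n) - 1
    pivots = {}  # pivot column -> row that is zero in every other pivot column
    for vs, rhs in eqs:
        row = rhs << n
        for v in vs:
            row ^= 1 << v
        # Clear all existing pivot columns from the new row.
        for c, prow in pivots.items():
            if (row >> c) & 1:
                row ^= prow
        lhs = row & var_mask
        if lhs == 0:
            if row >> n:
                return None  # reduced to 0 = 1
            continue  # redundant equation
        c = (lhs & -lhs).bit_length() - 1  # lowest set variable bit
        # Keep the invariant: remove the new pivot column from older rows.
        for p in pivots:
            if (pivots[p] >> c) & 1:
                pivots[p] ^= row
        pivots[c] = row

    # Particular solution: free variables 0, pivot variables equal to the rhs.
    sol = 0
    for c, row in pivots.items():
        if (row >> n) & 1:
            sol |= 1 << c
    # One kernel vector per free variable f: set f, then fix the pivots.
    kernel = []
    for f in range(n):
        if f in pivots:
            continue
        vec = 1 << f
        for c, row in pivots.items():
            if (row >> f) & 1:
                vec |= 1 << c
        kernel.append(vec)
    return sol, kernel


if __name__ == "__main__":
    n, m, k = 40, 25, 3
    planted = random.Random(1).getrandbits(n)
    eqs = random_kxor_system(n, m, k, planted=planted, seed=2)
    sol, kernel = solve_gf2(eqs, n)
    # Hamming weights of this particular basis: the size of each single move.
    print(sorted(bin(b).count("1") for b in kernel))
```

Flipping the variables in any kernel vector maps a solution to another solution. The weights printed here belong to one particular basis and are not claimed to be minimal; the dichotomy in the abstract concerns whether the difference between two solutions can be decomposed into kernel vectors of weight O(log n) or necessarily involves one of weight \Omega(n).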