Efficient Context-Preserving Encoding and Decoding of Compositional Structures Using Sparse Binary Representations
Abstract
1. Introduction
- We present a novel algorithm for encoding compositional SDR structures, termed context-preserving SDR encoding (CPSE). Our method builds on the ideas of the CDT [29] method while addressing some of its weaknesses.
- We propose a novel decoding algorithm called context-preserving SDR decoding (CPSD) based on triadic memory [30].
- CPSE encoding improves upon CDT by incorporating information about the sequence of base-level components within the composite SDR.
- CPSE encoding improves upon CDT by allowing us to optimize the convergence speed to achieve the desired output SDR sparsity.
- We show that CPSE encoding requires a near-constant amount of computation for order-invariant structures containing more than four components, irrespective of the base-level components.
- We show that CPSD decoding allows for accurate retrieval of the identity and order of base components from a composite SDR.
- We show the trade-off between SDR size and the number of elements that can be encoded while maintaining a low decoding error rate.
- We detail a purely compute-based decoding method applicable in a simple setting where the base-level components are known, and a TM-based method for the general case where the number of possible base-level components is very large.
- We show the trade-off between TM memory size and decoding bit error probability, and the memory size needed to keep that probability acceptably low.
2. Materials and Methods
2.1. Background
2.1.1. Related Works
2.1.2. Sparse Binary Representation (SDR)
2.2. Problem Statement
- Given a set of base-level component SDRs X, the encoder allows the creation of a composite SDR representation Y using M component vectors from X as input.
- The encoder output SDR will have size n and sparsity similar to those of the base-level components.
- The encoding method should be deterministic: the same input always results in the same output.
- As the similarity between inputs from X increases, so does the similarity (overlap score, illustrated in the sketch after this list) between the corresponding encoder output SDRs.
- The encoder output SDR will contain information about the identity of the base-level components and their order of composition in a way that allows decoding.
- It will be possible to find the number of base components M in Y.
- The decoder will allow us to find the identity of all M base component vectors and the order of their composition in Y with a low probability of error.
- Given a component SDR xi, it will be possible to determine whether xi is part of Y and, if so, its position in the composition.
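The overlap score used throughout is simply the number of ON bits two SDRs have in common. A minimal sketch, assuming SDRs are represented as dense 0/1 NumPy vectors (the size n = 2000 and 2% sparsity are illustrative values, matching the parameters used later in the false-positive table):

```python
import numpy as np

def random_sdr(n: int, w: int, rng: np.random.Generator) -> np.ndarray:
    """Return a random binary SDR of length n with exactly w ON bits."""
    sdr = np.zeros(n, dtype=np.uint8)
    sdr[rng.choice(n, size=w, replace=False)] = 1
    return sdr

def overlap(a: np.ndarray, b: np.ndarray) -> int:
    """Overlap score: the number of ON bits the two SDRs share."""
    return int(np.sum(a & b))

rng = np.random.default_rng(0)
x1 = random_sdr(n=2000, w=40, rng=rng)  # 2% sparsity
x2 = random_sdr(n=2000, w=40, rng=rng)
print(overlap(x1, x1))  # 40: identical SDRs share all ON bits
print(overlap(x1, x2))  # near 0: independent random SDRs rarely collide
```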
2.3. CDT Encoding Method
2.3.1. Additive CDT Method
- Z = x1 | x2 | … | xM (union of the M SDRs in X);
- Y = 0;
- Create a random permutation matrix P;
- While Y contains fewer ON bits than the target, repeat:
- Z~ = P(Z~) (apply the random permutation; Z~ starts as Z);
- Y = Y | (Z & Z~).
Toy example to explain the algorithm steps (a runnable sketch follows this list):
- x1 = [1 1 0 0 0 0 0 0 0 0];
- x2 = [0 0 0 0 0 0 1 1 0 0];
- Z = [1 1 0 0 0 0 1 1 0 0];
- P = shift 1 position to the right;
- P(Z) = [0 1 1 0 0 0 0 1 1 0];
- Y = 0 | ([1 1 0 0 0 0 1 1 0 0] & [0 1 1 0 0 0 0 1 1 0]) = [0 1 0 0 0 0 0 1 0 0];
- Completed after 1 iteration.
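A minimal sketch of the additive procedure above, with a random cyclic shift standing in for the random permutation matrix (an assumption made for brevity; any fixed random permutation would serve):

```python
import numpy as np

def additive_cdt(components: list[np.ndarray], target_on_bits: int,
                 rng: np.random.Generator) -> np.ndarray:
    """Additive CDT sketch: accumulate intersections of the union Z with
    permuted copies of itself until Y holds enough ON bits."""
    z = np.zeros_like(components[0])
    for x in components:
        z |= x                                  # Z = union of the M components
    y = np.zeros_like(z)
    z_perm = z
    while y.sum() < target_on_bits:
        z_perm = np.roll(z_perm, int(rng.integers(1, len(z))))  # apply a permutation
        y |= z & z_perm                         # keep only bits that collide
    return y

x1 = np.array([1, 1, 0, 0, 0, 0, 0, 0, 0, 0], dtype=np.uint8)
x2 = np.array([0, 0, 0, 0, 0, 0, 1, 1, 0, 0], dtype=np.uint8)
print(additive_cdt([x1, x2], target_on_bits=2, rng=np.random.default_rng(1)))
# two ON bits survive, at positions that depend on the permutation drawn
```

Note that a purely additive pass can overshoot the target number of ON bits, since each iteration only adds bits.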
2.3.2. Additive CDT Performance Analysis
2.3.3. Subtractive CDT Method
- Z = x1 | x2 | … | xM (union of the M SDRs in X);
- Y = Z;
- Create a random permutation matrix P;
- While Y contains more ON bits than the target, repeat:
- Z~ = P(Z~) (apply the random permutation; Z~ starts as Z);
- Y = Y & ~Z~.
Toy example to explain the algorithm steps (a runnable sketch follows this list):
- x1 = [1 1 0 0 0 0 0 0 0 0];
- x2 = [0 0 0 0 0 0 1 1 0 0];
- Z = [1 1 0 0 0 0 1 1 0 0];
- P = shift 1 position to the right;
- P(Z) = [0 1 1 0 0 0 0 1 1 0];
- Y = Z & ~P(Z) = [1 1 0 0 0 0 1 1 0 0] & [1 0 0 1 1 1 1 0 0 1] = [1 0 0 0 0 0 1 0 0 0];
- Completed after 1 iteration.
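The same sketch adapted to the subtractive variant; again a random cyclic shift stands in for the permutation matrix, and the loop may finish slightly below the target since several bits can be cleared in a single pass:

```python
import numpy as np

def subtractive_cdt(components: list[np.ndarray], target_on_bits: int,
                    rng: np.random.Generator) -> np.ndarray:
    """Subtractive CDT sketch: start from the union Z and clear bits that
    collide with permuted copies of Z until Y is thin enough."""
    z = np.zeros_like(components[0])
    for x in components:
        z |= x                                  # Z = union of the M components
    y = z.copy()
    z_perm = z
    while y.sum() > target_on_bits:             # may end slightly below the target
        z_perm = np.roll(z_perm, int(rng.integers(1, len(z))))  # apply a permutation
        y &= z_perm ^ 1                         # drop bits selected by the permuted copy
    return y

x1 = np.array([1, 1, 0, 0, 0, 0, 0, 0, 0, 0], dtype=np.uint8)
x2 = np.array([0, 0, 0, 0, 0, 0, 1, 1, 0, 0], dtype=np.uint8)
print(subtractive_cdt([x1, x2], target_on_bits=2, rng=np.random.default_rng(1)))
# e.g. [1 0 0 0 0 0 1 0 0 0] when the drawn shift is 1, matching the toy example
```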
2.3.4. Subtractive CDT Performance Analysis
2.4. Context-Preserving SDR Encoding (CPSE) Method
- At startup, compute {P1, …, PM}, a set of M random permutation matrices. For an SDR vector of size n, each permutation matrix has size n × n. The permutation matrices are used to encode information about the position of the base-level SDRs: for each position i = 1…M, a different permutation matrix is used. All invocations of the encoder use the same matrix for position information encoding.
- The input to the encoder is X = {x1, …, xM}, a set of random component vectors.
- Add position information to the encoded elements by computing the vector X′, where for each component xi in X, component i in X′ is x′i = Pi(xi) (a runnable sketch of this step appears after the list).
- Z = x′1 | x′2 | … | x′M (union of the M SDRs in X′).
- Create a random permutation matrix P.

Additive Phase
- Y = 0, Z~ = Z.
- While Y contains fewer ON bits than the target, repeat:
- Z~ = P(Z~) (apply the random permutation).
- Y = Y | (Z & Z~).

Subtractive Phase
- While Y contains more ON bits than the target, repeat:
- Z~ = P(Z~).
- Y = Y & ~Z~.

For Generic CPSD Decoding
- For each component i in X′, store the corresponding mapping in memory (the TM), so that the component can later be retrieved from Y.
- Compute {Pc1, …, PcN}, a set of N random permutation matrices, one for each feature channel. The permutation matrix for channel i can be computed as Pc^i, i.e., some fixed random permutation Pc raised to the i-th power.
- Compute {c1, …, cN}, a set of N random SDRs, one for each feature channel.
- Use the CPSE encoder to compute the vector X′ such that, for each component xi from feature channel j, the corresponding component in X′ is the CPSE encoding of the two SDRs xi and cj.
- The input to CPSE encoding step 2 is then this set instead of X.
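A minimal sketch of the CPSE position-encoding idea referenced above: each component is permuted by a position-specific permutation before the union is formed, and the union is then thinned down to the target sparsity. Index arrays stand in for the n × n permutation matrices, and a single subtractive loop stands in for the paper's additive-plus-subtractive scheduling; all helper names are illustrative:

```python
import numpy as np

def make_position_permutations(n: int, max_positions: int,
                               rng: np.random.Generator) -> list[np.ndarray]:
    """One fixed random permutation per position, drawn once at startup and
    reused by every encoder invocation (index arrays stand in for n x n matrices)."""
    return [rng.permutation(n) for _ in range(max_positions)]

def cpse_encode(components: list[np.ndarray], position_perms: list[np.ndarray],
                target_on_bits: int, rng: np.random.Generator) -> np.ndarray:
    """CPSE sketch: permute each component by its position, take the union,
    then thin the union down to the target number of ON bits."""
    # Inject position information, then form the union Z.
    z = np.zeros_like(components[0])
    for i, x in enumerate(components):
        z |= x[position_perms[i]]
    # Thin Z; a single subtractive loop is used here for brevity.
    y = z.copy()
    while y.sum() > target_on_bits:
        z_perm = np.roll(z, int(rng.integers(1, len(z))))
        y &= z_perm ^ 1
    return y

rng = np.random.default_rng(2)
perms = make_position_permutations(n=2000, max_positions=8, rng=rng)
parts = [np.zeros(2000, dtype=np.uint8) for _ in range(3)]
for p in parts:
    p[rng.choice(2000, size=40, replace=False)] = 1   # 2%-sparse components
code = cpse_encode(parts, perms, target_on_bits=40, rng=rng)
print(code.sum())  # close to 40 ON bits: a position-aware composite SDR
```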
2.5. Context-Preserving SDR Decoding (CPSD) Method
2.5.1. Basic CPSD Decoding Procedure
- Calculate the first component estimate: y~1 = P1⁻¹(Y) (undo the permutation used for position 1).
- For each possible component value xj in X, compute the overlap score with y~1.
- Set y1 to be the xj with the maximum overlap score.
- Set the number of components M in Y to be as follows:
- Set the minimum match threshold = .
- For i = 2:M, perform the following:
- Calculate y~i = Pi⁻¹(Y) (undo the permutation used for position i).
- For each possible component value xj in X, compute the overlap score with y~i.
- Find the list of elements whose overlap score is at least the minimum match threshold.
- If the list contains a single element, set yi to be that element. If the list contains multiple elements, set yi to be the element with the maximum overlap score. Optionally, return the K candidate elements for yi with the K highest overlap scores; K is a parameter of the decoder (a sketch of this comparison-based procedure follows the list).
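A sketch of the comparison-based decoder described above, for the basic setting in which the codebook of possible components is known. The number of components M and the match threshold are passed in as parameters here, whereas the paper estimates them from Y; the helper names are illustrative:

```python
import numpy as np

def invert_permutation(perm: np.ndarray) -> np.ndarray:
    """Index array that undoes a permutation given as an index array."""
    inv = np.empty_like(perm)
    inv[perm] = np.arange(len(perm))
    return inv

def basic_cpsd_decode(y: np.ndarray, codebook: dict[str, np.ndarray],
                      position_perms: list[np.ndarray],
                      num_components: int, min_match: int) -> list[str | None]:
    """Comparison-based CPSD sketch: undo each position's permutation and pick
    the codebook entry with the highest overlap score, if it clears the
    minimum match threshold."""
    decoded = []
    for i in range(num_components):
        y_i = y[invert_permutation(position_perms[i])]
        scores = {name: int(np.sum(y_i & x)) for name, x in codebook.items()}
        best = max(scores, key=scores.get)
        decoded.append(best if scores[best] >= min_match else None)
    return decoded

# End-to-end toy run: encode three known components with position permutations,
# thin the union, then decode identity and order back.
rng = np.random.default_rng(4)
n, w = 2000, 40
codebook = {name: np.zeros(n, dtype=np.uint8) for name in "ABCDEF"}
for v in codebook.values():
    v[rng.choice(n, size=w, replace=False)] = 1
perms = [rng.permutation(n) for _ in range(8)]
order = ["C", "A", "E"]
z = np.zeros(n, dtype=np.uint8)
for i, name in enumerate(order):
    z |= codebook[name][perms[i]]
y = z.copy()
while y.sum() > w:                       # simple subtractive thinning of the union
    y &= np.roll(z, int(rng.integers(1, n))) ^ 1
print(basic_cpsd_decode(y, codebook, perms, num_components=3, min_match=5))
# expected: ['C', 'A', 'E'] -- identity and order are recovered
```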
2.5.2. Generic CPSD Decoding Procedure
- Query the TM with Y to retrieve a candidate encoding of the first component.
- Find the overlap score between the TM output and Y.
- If the overlap score is below the overlap threshold, stop the decoding procedure, as Y cannot be decoded.
- Set y1 to be the component recovered from the TM output.
- Set the number of components M in Y to be as follows:
- Set the minimum match threshold = .
- For i = 2:M, perform the following:
- Query the TM for a candidate for component i.
- If the overlap score with Y is at least the minimum match threshold, set yi to be the retrieved candidate (a toy triadic memory illustrating the store/query pattern follows this list).
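The generic decoder relies on a triadic memory [30], an associative memory for triples of SDRs: store {x, y, z}, then recall any one element given the other two. A dense toy version is sketched below purely to illustrate that store/query pattern; the class name and layout are illustrative assumptions, and the paper's experiments use far larger, weight-limited (1- to 8-bit) memories, as reported in the Results table.

```python
import numpy as np

class ToyTriadicMemory:
    """Dense toy version of a triadic memory: stores {x, y, z} triples of
    binary SDRs and recalls z given (x, y). Real implementations use sparse
    storage and bounded (1- to 8-bit) weights; keep n small for this toy."""
    def __init__(self, n: int, w: int):
        self.n, self.w = n, w
        self.t = np.zeros((n, n, n), dtype=np.uint8)

    def store(self, x: np.ndarray, y: np.ndarray, z: np.ndarray) -> None:
        xi, yi, zi = (np.flatnonzero(v) for v in (x, y, z))
        self.t[np.ix_(xi, yi, zi)] += 1          # strengthen all ON-bit triples

    def recall_z(self, x: np.ndarray, y: np.ndarray) -> np.ndarray:
        xi, yi = np.flatnonzero(x), np.flatnonzero(y)
        votes = self.t[np.ix_(xi, yi)].sum(axis=(0, 1))   # votes per output bit
        z = np.zeros(self.n, dtype=np.uint8)
        z[np.argsort(votes)[-self.w:]] = 1                # keep the w strongest bits
        return z

rng = np.random.default_rng(3)
def rand_sdr(n, w):
    v = np.zeros(n, dtype=np.uint8)
    v[rng.choice(n, size=w, replace=False)] = 1
    return v

tm = ToyTriadicMemory(n=64, w=4)
a, b, c = rand_sdr(64, 4), rand_sdr(64, 4), rand_sdr(64, 4)
tm.store(a, b, c)
print(np.sum(tm.recall_z(a, b) & c))  # 4: the stored association is recovered
```

Recall keeps the w highest-scoring bits, so retrieval degrades gradually rather than failing outright as more triples are written; Section 3.2 quantifies this behaviour for the full-scale memory.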
3. Results
3.1. CPSE Performance Analysis
3.2. CPSD Performance Analysis
4. Discussion
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
AI | Artificial Intelligence
SDR | Binary Sparse Distributed Representation
CNN | Convolutional Neural Network
CDT | Context-Dependent Thinning
TM | Triadic Memory
CPSE | Context-Preserving SDR Encoding
CPSD | Context-Preserving SDR Decoding
HDC | Hyperdimensional Computing
VSA | Vector Symbolic Architectures
TPS | Tensor Product Representations
HRR | Holographic Reduced Representations
SBC | Sparse Block Codes
VTB | Vector-Derived Transformation Binding
BDSC | Binary Sparse Distributed Codes
BSC | Binary Spatter Codes
References
- Malik, S.; Muhammad, K.; Waheed, Y. Artificial intelligence and industrial applications-A revolution in modern industries. Ain Shams Eng. J. 2024, 15, 102886. [Google Scholar] [CrossRef]
- Rambelli, G. Constructions and Compositionality: Cognitive and Computational Explorations; Cambridge University Press: Cambridge, UK, 2025. [Google Scholar] [CrossRef]
- LeCun, Y.; Bottou, L.; Bengio, Y.; Haffner, P. Gradient-based learning applied to document recognition. Proc. IEEE 1998, 86, 2278–2323. [Google Scholar] [CrossRef]
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. Adv. Neural Inf. Process. Syst. 2017, 30, 5999–6009. [Google Scholar] [CrossRef]
- Zhou, J.; Cui, G.; Hu, S.; Zhang, Z.; Yang, C.; Liu, Z.; Wang, L.; Li, C.; Sun, M. Graph neural networks: A review of methods and applications. AI Open 2020, 1, 57–81. [Google Scholar] [CrossRef]
- Henderson, J. The Unstoppable Rise of Computational Linguistics in Deep Learning. arXiv 2020, arXiv:2005.06420. [Google Scholar] [CrossRef]
- Smolensky, P.; McCoy, R.T.; Fernandez, R.; Goldrick, M.; Gao, J. Neurocompositional computing: From the Central Paradox of Cognition to a new generation of AI systems. AI Mag. 2022, 43, 308–322. [Google Scholar] [CrossRef]
- Lennie, P. The Cost of Cortical Computation. Curr. Biol. 2003, 13, 493–497. [Google Scholar] [CrossRef]
- Hromádka, T.; DeWeese, M.R.; Zador, A.M. Sparse representation of sounds in the unanesthetized auditory cortex. PLoS Biol. 2008, 6, 124–137. [Google Scholar] [CrossRef]
- Weliky, M.; Fiser, J.; Hunt, R.H.; Wagner, D.N. Coding of natural scenes in primary visual cortex. Neuron 2003, 37, 703–718. [Google Scholar] [CrossRef]
- Kanerva, P. Hyperdimensional computing: An introduction to computing in distributed representation with high-dimensional random vectors. Cogn. Comput. 2009, 1, 139–159. [Google Scholar] [CrossRef]
- Gayler, R.W. Vector symbolic architectures answer jackendoff’s challenges for cognitive neuroscience. arXiv 2003, arXiv:cs/0412059. [Google Scholar] [CrossRef]
- Kleyko, D.; Rachkovskij, D.; Osipov, E.; Rahimi, A. A Survey on Hyperdimensional Computing aka Vector Symbolic Architectures, Part II: Applications, Cognitive Models, and Challenges. ACM Comput. Surv. 2023, 55, 1–52. [Google Scholar] [CrossRef]
- Neubert, P.; Schubert, S.; Protzel, P. An Introduction to Hyperdimensional Computing for Robotics. Künstliche Intell. 2019, 33, 319–330. [Google Scholar] [CrossRef]
- Rahimi, A.; Kanerva, P.; Benini, L.; Rabaey, J.M. Efficient Biosignal Processing Using Hyperdimensional Computing: Network Templates for Combined Learning and Classification of ExG Signals. Proc. IEEE 2019, 107, 123–143. [Google Scholar] [CrossRef]
- Rahimi, A.; Datta, S.; Kleyko, D.; Frady, E.P.; Olshausen, B.; Kanerva, P.; Rabaey, J.M. High-Dimensional Computing as a Nanoscalable Paradigm. IEEE Trans. Circuits Syst. I Regul. Pap. 2017, 64, 2508–2521. [Google Scholar] [CrossRef]
- Kleyko, D.; Davies, M.; Frady, E.P.; Kanerva, P.; Kent, S.J.; Olshausen, B.A.; Osipov, E.; Rabaey, J.M.; Rachkovskij, D.A.; Rahimi, A.; et al. Vector Symbolic Architectures as a Computing Framework for Emerging Hardware. Proc. IEEE 2022, 110, 1538–1571. [Google Scholar] [CrossRef]
- Ma, Y.; Hildebrandt, M.; Baier, S.; Tresp, V. Holistic representations for memorization and inference. UAI 2018, 1, 403–413. [Google Scholar] [CrossRef]
- Kleyko, D.; Rachkovskij, D.A.; Osipov, E.; Rahimi, A. A Survey on Hyperdimensional Computing aka Vector Symbolic Architectures, Part I: Models and Data Transformations. ACM Comput. Surv. 2022, 55, 1–40. [Google Scholar] [CrossRef]
- Frady, E.P.; Kleyko, D.; Sommer, F.T. Variable Binding for Sparse Distributed Representations: Theory and Applications. IEEE Trans. Neural Netw. Learn. Syst. 2023, 34, 2191–2204. [Google Scholar] [CrossRef]
- Plate, T.A. Holographic reduced representations. IEEE Trans. Neural Netw. 1995, 6, 623–641. [Google Scholar] [CrossRef]
- Von Der Malsburg, C. The What and Why of Binding. Neuron 1999, 24, 95–104. [Google Scholar] [CrossRef] [PubMed]
- Kanerva, P. The Spatter Code for Encoding Concepts at Many Levels. In ICANN ’94; Marinaro, M., Morasso, P.G., Eds.; Springer: London, UK, 1994. [Google Scholar] [CrossRef]
- Smolensky, P. Tensor product variable binding and the representation of symbolic structures in connectionist systems. Artif. Intell. 1990, 46, 159–216. [Google Scholar] [CrossRef]
- Yeung, C.; Zou, Z.; Imani, M. Generalized Holographic Reduced Representations. arXiv 2024, arXiv:2405.09689. [Google Scholar] [CrossRef]
- Rachkovskij, D.A. Representation and processing of structures with binary sparse distributed codes. IEEE Trans. Knowl. Data Eng. 2001, 13, 261–276. [Google Scholar] [CrossRef]
- Laiho, M.; Poikonen, J.H.; Kanerva, P.; Lehtonen, E. High-dimensional computing with sparse vectors. In Proceedings of the 2015 IEEE Biomedical Circuits and Systems Conference (BioCAS), Atlanta, GA, USA, 22–24 October 2015; pp. 1–4. [Google Scholar] [CrossRef]
- Gosmann, J.; Eliasmith, C. Vector-Derived Transformation Binding: An Improved Binding Operation for Deep Symbol-Like Processing in Neural Networks. Neural Comput. 2019, 31, 849–869. [Google Scholar] [CrossRef]
- Rachkovskij, D.A.; Kussul, E.M. Binding and normalization of binary sparse distributed representations by context-dependent thinning. Neural Comput. 2001, 13, 411–452. [Google Scholar] [CrossRef]
- Available online: https://peterovermann.com/TriadicMemory.pdf (accessed on 1 July 2024).
- Schlegel, K.; Neubert, P.; Protzel, P. A comparison of vector symbolic architectures. Artif. Intell. Rev. 2022, 55, 4523–4555. [Google Scholar] [CrossRef]
- Ahmad, S.; Hawkins, J. Properties of Sparse Distributed Representations and their Application to Hierarchical Temporal Memory. arXiv 2015, arXiv:1503.07469. [Google Scholar] [CrossRef]
- Purdy, S. Encoding Data for HTM Systems. arXiv 2016, arXiv:1602.05925. [Google Scholar] [CrossRef]
- Thomas, A.; Dasgupta, S.; Rosing, T. A Theoretical Perspective on Hyperdimensional Computing (Extended Abstract). IJCAI Int. Jt. Conf. Artif. Intell. 2022, 72, 5772–5776. [Google Scholar] [CrossRef]
Number of Components | 1/M | FP N = 2000, s = 2% | FP N = 4000, s = 2%
---|---|---|---
2 | 50% | 8.30 × 10⁻²⁵ | 2.59 × 10⁻⁴⁹
3 | 33% | 2.21 × 10⁻¹² | 4.12 × 10⁻²⁶
4 | 25% | 5.10 × 10⁻⁸ | 7.06 × 10⁻¹⁶
5 | 20% | 1.85 × 10⁻⁵ | 8.34 × 10⁻¹¹
6 | 16% | 2.70 × 10⁻⁴ | 2.03 × 10⁻⁷
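The false-positive columns above can be checked against the standard exact expression for the probability that a random SDR with w ON bits out of N overlaps a fixed w-bit SDR in at least θ positions (see [32]). The sketch below evaluates that hypergeometric tail; the exact match threshold θ behind the table is not restated here, so the printed values are illustrative rather than a reproduction of the table:

```python
from math import comb

def fp_probability(n: int, w: int, theta: int) -> float:
    """Probability that a random SDR with w ON bits out of n overlaps a fixed
    w-bit SDR in at least theta positions (exact hypergeometric tail, cf. [32])."""
    total = comb(n, w)
    return sum(comb(w, b) * comb(n - w, w - b) for b in range(theta, w + 1)) / total

# Illustrative only: n = 2000, 2% sparsity (w = 40); vary the threshold theta.
for theta in (10, 15, 20):
    print(theta, fp_probability(2000, 40, theta))
```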
R/W | 0 | 1 | 2 | 3 | 4 | ≥5
---|---|---|---|---|---|---
1-bit weight | | | | | |
500,000 | 100 | 0 | 0 | 0 | 0 | 0
600,000 | 98.7 | 1.2 | 0.1 | 0 | 0 | 0
700,000 | 0 | 1.6 | 5.3 | 11.4 | 17.2 | 64.5
2-bit weight | | | | | |
600,000 | 100 | 0 | 0 | 0 | 0 | 0
700,000 | 99.9 | 0.1 | 0 | 0 | 0 | 0
800,000 | 97.1 | 2.8 | 0.1 | 0 | 0 | 0
900,000 | 43.7 | 46.1 | 9.2 | 0.8 | 0.1 | 0.1
1,000,000 | 0.1 | 6.2 | 20.5 | 29.8 | 24.4 | 19
4-bit weight | | | | | |
500,000 | 100 | 0 | 0 | 0 | 0 | 0
600,000 | 99.9 | 0.1 | 0 | 0 | 0 | 0
700,000 | 98.5 | 1.4 | 0.1 | 0 | 0 | 0
800,000 | 88.4 | 11.3 | 0.2 | 0.1 | 0 | 0
900,000 | 57.1 | 37.9 | 4.5 | 0.3 | 0.1 | 0.1
1,000,000 | 22.1 | 49 | 23.4 | 4.8 | 0.6 | 0.1
1,100,000 | 4.8 | 27.9 | 36.8 | 21.6 | 7.1 | 1.8
8-bit weight | | | | | |
500,000 | 100 | 0 | 0 | 0 | 0 | 0
600,000 | 99.9 | 0.1 | 0 | 0 | 0 | 0
700,000 | 98.2 | 1.7 | 0.1 | 0 | 0 | 0
800,000 | 86.1 | 13.5 | 0.3 | 0.1 | 0 | 0
900,000 | 53.4 | 40.3 | 5.7 | 0.4 | 0.1 | 0.1
1,000,000 | 18.7 | 47.3 | 26.6 | 6.4 | 0.9 | 0.1
1,100,000 | 2.8 | 20.8 | 35.1 | 26.4 | 11.2 | 3.7