Innovative Application of Blockchain Technology for Digital Recipe Copyright Protection
Abstract
1. Introduction
- A novel blockchain-based copyright protection scheme for recipes is proposed to provide all participants with a fair, secure, and trustworthy digital environment.
- The traditional SimHash algorithm is improved to address imperfect audit mechanisms. The method combines TF-IDF and PageRank with SimHash. Compared with the conventional model, the new approach accounts for semantic information and word importance in addition to word frequency, significantly improving accuracy and recall.
- To solve the centralization problem of the traditional DPOS consensus mechanism, a consensus mechanism based on recipe quality is proposed. It improves DPOS by calculating each node’s rank from the completeness, logic, and detail of recipes.
2. Literature Review
3. Schematic Design
3.1. Problem Analysis
- When nodes transmit recipe data, a copyright review algorithm must run text similarity detection on that data to prevent inappropriate content from being submitted;
- The DPOS consensus mechanism is widely criticized as too centralized, and the election of proxy bookkeeping nodes leaves room for human manipulation; both problems need to be addressed;
- When an incident of unauthorized use occurs, we need to ensure that rights holders can quickly trace their copyrights to protect the legitimate rights and interests of creators.
3.2. System Design
3.2.1. System Composition and User Interaction Flow
3.2.2. Blockchain Data Upload and Download
Algorithm 1. RecipeUploadContract
Algorithm 2. RecipeDownloadContract
4. Text Similarity Detection
4.1. SimHash Algorithm
4.2. PageRank Algorithm
- Initialization of PageRank values:
- First iteration:
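The initialization and iteration steps named above can be illustrated with the standard power-iteration form of PageRank. This is a minimal sketch, not the article's exact procedure: the damping factor d = 0.85, the starting value (1 − d)/N (which gives 0.0250 for six nodes, consistent with the iteration-0 row of the PageRank table in this article), the synchronous update rule, and the edges of the word graph are all assumptions, so the printed values are not expected to match the table.

```python
def pagerank(graph, d=0.85, iterations=10):
    """Power-iteration PageRank.

    graph maps every node to the list of nodes it links to
    (every link target must also appear as a key).
    """
    nodes = list(graph)
    n = len(nodes)

    # Assumed initialization: every node starts at (1 - d) / N.
    rank = {v: (1 - d) / n for v in nodes}

    # Pre-compute, for each node, which nodes link to it.
    incoming = {v: [] for v in nodes}
    for src, targets in graph.items():
        for dst in targets:
            incoming[dst].append(src)

    for _ in range(iterations):
        # Synchronous update: each node collects an equal share of the
        # previous-iteration rank of every node that links to it.
        rank = {
            v: (1 - d) / n
            + d * sum(rank[u] / len(graph[u]) for u in incoming[v])
            for v in nodes
        }
    return rank


# Hypothetical word graph for "the cat sitting on the mat";
# the edges are illustrative only.
word_graph = {
    "the1": ["cat"],
    "cat": ["sitting"],
    "sitting": ["cat", "on"],
    "on": ["sitting", "mat"],
    "mat": ["on"],
    "the2": ["mat"],
}

ranks = pagerank(word_graph)
for word, value in ranks.items():
    print(f"{word}: {value:.4f}")
```

Words with no incoming links stay at the base value (1 − d)/N, while words that sit in the middle of the graph accumulate rank from their neighbors.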
4.3. Improved SimHash Algorithm
Algorithm 3 Sentence Similarity Calculation |
|
4.4. Results of the Experiment
4.4.1. Datasets
- Delete words or sentences: Delete parts of the ingredients or parts of the steps with a probability of 20%;
- Replace ingredients: Replace some ingredients or cooking methods in a recipe with a predefined list of ingredient synonyms, e.g., replace “red pepper” with [“red chili pepper”, “red bell pepper”] or replace “stir-fry” with [“boil”, “slow stir-fry”];
- Reorder words or statements: Swap the order of some words and statements.
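The three perturbations listed above could be implemented roughly as follows. This is a sketch under stated assumptions: the 20% deletion probability and the two synonym entries come from the text, while the function names, the choice to delete each item independently, and the choice to swap exactly one random pair are hypothetical.

```python
import random

# Synonym entries quoted in the text; a real list would be longer.
SYNONYMS = {
    "red pepper": ["red chili pepper", "red bell pepper"],
    "stir-fry": ["boil", "slow stir-fry"],
}


def delete_items(items, rng, p=0.2):
    """Drop each ingredient or step independently with probability p."""
    return [item for item in items if rng.random() >= p]


def replace_terms(text, rng):
    """Replace every listed term found in text with a random alternative."""
    for term, alternatives in SYNONYMS.items():
        if term in text:
            text = text.replace(term, rng.choice(alternatives))
    return text


def reorder(items, rng):
    """Swap two randomly chosen positions (assumed reordering strategy)."""
    items = list(items)
    if len(items) >= 2:
        i, j = rng.sample(range(len(items)), 2)
        items[i], items[j] = items[j], items[i]
    return items


rng = random.Random(0)  # fixed seed so a run can be repeated
steps = ["wash the red pepper", "heat the oil", "stir-fry for two minutes"]

print(delete_items(steps, rng))
print([replace_terms(step, rng) for step in steps])
print(reorder(steps, rng))
```

Passing an explicit `random.Random` instance keeps the generated near-duplicates reproducible across experiments.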
4.4.2. Experimental Environment and Comparison Algorithms
- SimHash algorithm [15]: SimHash is a technique that transforms high-dimensional feature vectors into low-dimensional feature fingerprints. It generates fingerprints by calculating the statistical characteristics of the input data and compares these fingerprints using the Hamming distance. Due to its dimensionality reduction properties, SimHash is capable of effectively handling large-scale or high-dimensional data, reducing the complexity of storage and computation. It is particularly suited for detecting near-duplicate contents, such as in web crawling or copyright detection scenarios.
- Method based on pretrained word models [37]: This method leverages word vector models that are pretrained on extensive corpora, like Word2Vec or GloVe. Through these models, each word can be translated into a vector of fixed size, capturing the semantic information and relationships between words. Averaging or weighted averaging of word vectors in a sentence or document can derive a vector representation of the sentence or document. The strength of this method lies in its ability to capture the deep semantic information of texts.
- Vector space model (VSM) combined with TF-IDF [38]: The VSM is a method of representing text as vectors, where each dimension represents a specific word or term. TF-IDF is a statistical method used to evaluate the importance of a word in a collection of documents, with TF representing term frequency and IDF denoting inverse document frequency. Combining VSM with TF-IDF results in a weighted feature vector of the text. Subsequently, cosine similarity can be utilized to compute the similarity between these vectors. This method’s advantage is that it captures the term frequency and significance in the text, thereby enhancing the accuracy of similarity computation.
- WMF-LDA [39]: The WMF-LDA algorithm utilizes the semantic and part-of-speech information of words, as well as the domain differences between different types of texts, to improve the application of the traditional LDA model in the field of text similarity computation. This algorithm first maps domain words and synonyms to a unified word through the word2vec model, filters out irrelevant words based on their part-of-speech, and finally uses the LDA model to perform topic modeling on the text. The JS distance is then used to calculate the similarity between texts.
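The SimHash baseline in the list above, which builds a fingerprint from weighted features and compares fingerprints by Hamming distance, can be sketched as follows. This is a minimal illustration rather than the article's implementation: using SHA-256 truncated to 64 bits as the per-token hash, the `simhash` and `hamming_distance` names, and the example weights are assumptions. The weights are supplied by the caller, so they could be plain term frequencies or any other importance score.

```python
import hashlib


def simhash(weighted_tokens, bits=64):
    """Return a `bits`-bit SimHash fingerprint.

    weighted_tokens maps each token to a numeric weight.
    bits must be a multiple of 8 and at most 256 (SHA-256 digest size).
    """
    totals = [0.0] * bits
    for token, weight in weighted_tokens.items():
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        token_hash = int.from_bytes(digest[: bits // 8], "big")
        for i in range(bits):
            # A set bit votes +weight for position i, a clear bit -weight.
            if (token_hash >> i) & 1:
                totals[i] += weight
            else:
                totals[i] -= weight

    fingerprint = 0
    for i, total in enumerate(totals):
        if total > 0:
            fingerprint |= 1 << i
    return fingerprint


def hamming_distance(a, b):
    """Number of bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


recipe_a = simhash({"tomato": 2.0, "egg": 1.0, "stir-fry": 1.0})
recipe_b = simhash({"tomato": 2.0, "egg": 1.0, "boil": 1.0})
print(hamming_distance(recipe_a, recipe_b))
```

Two texts are treated as near-duplicates when the distance falls below a chosen threshold; the threshold itself is a tuning parameter that this sketch does not fix.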
4.4.3. Experimental Results and Analysis
4.5. Time Complexity Analysis
- Feature extraction: ;
- Feature weighting: ;
- Constructing SimHash value: , where b is the length of the SimHash value. Assuming a value of 64, then ;
- SVM training (in the worst case): . Assuming we have samples, then .
- Vocabulary amalgamation and filtering: ;
- WMF-LDA topic modeling: ;
- Text similarity computation: .
5. Improvement of Consensus Mechanisms and Quality Testing of Recipes
5.1. DPOS (Delegated Proof of Stake)
- Nodes show insufficient enthusiasm for voting;
- Malicious behavior cannot be stopped in time;
- There is a lot of room for human manipulation in the voting process, so it is widely questioned as being too centralized.
5.2. Improved DPOS Algorithm
- Non-production of blocks: Some nodes refuse to generate blocks after they become validator nodes, which is likely to cause delays in the network and reduce the system’s operational efficiency;
- Double signatures: Signing blocks on two different chains at the same time can lead to double spending and other problems, seriously affecting the balance of the blockchain;
- Inactivity: Validator nodes must actively participate in the network to validate transactions and generate new blocks. If a validator node is inactive for a long period, then it may be considered a malicious node by the network;
- Denial of service attacks: Validator nodes may attempt to block the network connectivity of other nodes by sending many invalid transactions or blocks;
- Transaction censorship: Validator nodes selectively include or exclude transactions when packing blocks;
- Validator nodes that behave maliciously have their scores canceled and are treated as normal nodes;
- When a node with a history of malicious behavior transmits data again for scoring, an exponential decay factor (F) is applied to its score.
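The penalty rule above can be illustrated with a small sketch. The exact definition of the decay factor F is not reproduced in this text, so the form used here, F = exp(−rate × offenses), along with the `rate` parameter and the function names, is a hypothetical choice meant only to show how an exponential decay would reduce the score of a repeat offender.

```python
import math


def decay_factor(offenses, rate=0.5):
    """Assumed form of F: 1.0 for a clean record, shrinking toward 0
    as the number of recorded malicious incidents grows."""
    return math.exp(-rate * offenses)


def penalized_score(raw_score, offenses, rate=0.5):
    """Scale a node's recipe score by the decay factor."""
    return raw_score * decay_factor(offenses, rate)


for offenses in range(4):
    print(offenses, round(penalized_score(80.0, offenses), 2))
```

With this form a node with no recorded incidents keeps its full score, and each further incident multiplies the score by the same constant factor exp(−rate).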
5.3. Dynamic Update of Node Rank
5.4. Recipe Evaluation Criteria
Algorithm 4. Recipe Scoring Algorithm
5.5. Experimental Comparison
5.5.1. Dataset
5.5.2. Experimental Setting, Experimental Design, and Analysis of Results
6. Performance Analysis
7. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Mizrahi, M.; Golan, A.; Mizrahi, A.B.; Gruber, R.; Lachnise, A.Z.; Zoran, A. Digital gastronomy: Methods & recipes for hybrid cooking. In Proceedings of the 29th Annual Symposium on User Interface Software and Technology, Tokyo, Japan, 16–19 October 2016; pp. 541–552.
- Subramanya, S.; Yi, B.K. Digital rights management. IEEE Potentials 2006, 25, 31–34.
- Ma, Z.; Jiang, M.; Gao, H.; Wang, Z. Blockchain for digital rights management. Future Gener. Comput. Syst. 2018, 89, 746–764.
- Holland, M.; Nigischer, C.; Stjepandić, J. Copyright protection in additive manufacturing with blockchain approach. In Transdisciplinary Engineering: A Paradigm Shift; IOS Press: Amsterdam, The Netherlands, 2017; pp. 914–921.
- Nakamoto, S. Bitcoin: A Peer-to-Peer Electronic Cash System. Decentralized Bus. Rev. 2008. Available online: https://assets.pubpub.org/d8wct41f/31611263538139.pdf (accessed on 28 August 2023).
- Wood, G. Ethereum: A secure decentralised generalised transaction ledger. Ethereum Proj. Yellow Pap. 2014, 151, 1–32.
- Taherdoost, H. The Role of Blockchain in Medical Data Sharing. Cryptography 2023, 7, 36.
- Garba, A.; Dwivedi, A.D.; Kamal, M.; Srivastava, G.; Tariq, M.; Hasan, M.A.; Chen, Z. A digital rights management system based on a scalable blockchain. Peer-to-Peer Netw. Appl. 2021, 14, 2665–2680.
- Bodó, B.; Gervais, D.; Quintais, J.P. Blockchain and smart contracts: The missing link in copyright licensing? Int. J. Law Inf. Technol. 2018, 26, 311–336.
- Sun, J. Current Status of Digital Infringement and Copyright Protection in Scientific Journals—A Preliminary Exploration of the Feasibility of Blockchain Technology (科技期刊数字侵权现状与版权保护——区块链技术可行性初探). Chin. J. Sci. Tech. Period. (中国科技期刊研究) 2018, 29, 1000–1005.
- Azaria, A.; Ekblaw, A.; Vieira, T.; Lippman, A. Medrec: Using blockchain for medical data access and permission management. In Proceedings of the 2016 2nd International Conference on Open and Big Data (OBD), Vienna, Austria, 22–24 August 2016; pp. 25–30.
- Natgunanathan, I.; Praitheeshan, P.; Gao, L.; Xiang, Y.; Pan, L. Blockchain-based audio watermarking technique for multimedia copyright protection in distribution networks. ACM Trans. Multimed. Comput. Commun. Appl. (TOMM) 2022, 18, 1–23.
- Zhao, F.; Zhou, W. Analysis of Digital Copyright Protection Based on Blockchain Technology (基于区块链技术保护数字版权问题探析). Sci. Technol. Law Rev. (科技与法律) 2017, 125, 59–70.
- Zhang, J. Research and Implementation of Digital Copyright System Based on Blockchain Technology (基于区块链技术的数字版权系统研究与实现). Ph.D. Thesis, Jiangsu University, Zhenjiang, China, 2022.
- Charikar, M.S. Similarity estimation techniques from rounding algorithms. In Proceedings of the Thirty-Fourth Annual ACM Symposium on Theory of Computing, Montreal, QC, Canada, 19–21 May 2002; pp. 380–388.
- Uddin, M.S.; Roy, C.K.; Schneider, K.A.; Hindle, A. On the effectiveness of simhash for detecting near-miss clones in large scale software systems. In Proceedings of the 2011 18th Working Conference on Reverse Engineering, Limerick, Ireland, 17–20 October 2011; pp. 13–22.
- Zou, W.; Lo, D.; Kochhar, P.S.; Le, X.B.D.; Xia, X.; Feng, Y.; Chen, Z.; Xu, B. Smart contract development: Challenges and opportunities. IEEE Trans. Softw. Eng. 2019, 47, 2084–2106.
- Bellare, M.; Yee, B. Forward-security in private-key cryptography. In Proceedings of the Topics in Cryptology—CT-RSA 2003: The Cryptographers’ Track at the RSA Conference 2003, San Francisco, CA, USA, 13–17 April 2003; pp. 1–18.
- Salomaa, A. Public-Key Cryptography; Springer: Berlin/Heidelberg, Germany, 1996.
- Benet, J. Ipfs-content addressed, versioned, p2p file system. arXiv 2014, arXiv:1407.3561.
- Ateniese, G.; Fu, K.; Green, M.; Hohenberger, S. Improved proxy re-encryption schemes with applications to secure distributed storage. ACM Trans. Inf. Syst. Secur. (TISSEC) 2006, 9, 1–30.
- Kent, C.K.; Salim, N. Features based text similarity detection. arXiv 2010, arXiv:1001.3487.
- Buyrukbilen, S.; Bakiras, S. Secure similar document detection with simhash. In Proceedings of the Secure Data Management: 10th VLDB Workshop, SDM 2013, Trento, Italy, 30 August 2013; Springer: Cham, Switzerland, 2014; pp. 61–75.
- Aizawa, A. An information-theoretic perspective of tf–idf measures. Inf. Process. Manag. 2003, 39, 45–65.
- Page, L.; Brin, S.; Motwani, R.; Winograd, T. The PageRank Citation Ranking: Bringing Order to the Web; Technical Report; Stanford University: Stanford, CA, USA, 1998.
- Brin, S.; Page, L. The anatomy of a large-scale hypertextual web search engine. Comput. Netw. ISDN Syst. 1998, 30, 107–117.
- Toutanova, K.; Klein, D.; Manning, C.D.; Singer, Y. Feature-rich part-of-speech tagging with a cyclic dependency network. In Proceedings of the 2003 Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics, Edmonton, AB, Canada, 27 May–1 June 2003; pp. 252–259.
- Schofield, A.; Mimno, D. Comparing apples to apple: The effects of stemmers on topic models. Trans. Assoc. Comput. Linguist. 2016, 4, 287–300.
- Vasiliev, Y. Natural Language Processing with Python and spaCy: A Practical Introduction; No Starch Press: San Francisco, CA, USA, 2020.
- Church, K.W. Word2Vec. Nat. Lang. Eng. 2017, 23, 155–162.
- Pennington, J.; Socher, R.; Manning, C.D. Glove: Global vectors for word representation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), Doha, Qatar, 25–29 October 2014; pp. 1532–1543.
- Devlin, J.; Chang, M.W.; Lee, K.; Toutanova, K. Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv 2018, arXiv:1810.04805.
- Li, B.; Han, L. Distance weighted cosine similarity measure for text classification. In Proceedings of the Intelligent Data Engineering and Automated Learning–IDEAL 2013: 14th International Conference, IDEAL 2013, Hefei, China, 20–23 October 2013; pp. 611–618.
- Norouzi, M.; Fleet, D.J.; Salakhutdinov, R.R. Hamming distance metric learning. Adv. Neural Inf. Process. Syst. 2012, 25. Available online: https://proceedings.neurips.cc/paper_files/paper/2012/file/59b90e1005a220e2ebc542eb9d950b1e-Paper.pdf (accessed on 28 August 2023).
- Cherkassky, V.; Ma, Y. Practical selection of SVM parameters and noise estimation for SVM regression. Neural Netw. 2004, 17, 113–126.
- Agirre, E.; Banea, C.; Cardie, C.; Cer, D.; Diab, M.; Gonzalez-Agirre, A.; Guo, W.; Mihalcea, R.; Rigau, G.; Uria, L. SemEval-2016 task 1: Semantic textual similarity, monolingual and cross-lingual evaluation. In Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016), San Diego, CA, USA, 16–17 June 2016; pp. 497–511.
- Mikolov, T.; Chen, K.; Corrado, G.; Dean, J. Efficient estimation of word representations in vector space. arXiv 2013, arXiv:1301.3781.
- Zhai, C.; Lafferty, J. A study of smoothing methods for language models applied to ad hoc information retrieval. In Proceedings of the 24th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, New Orleans, LA, USA, 9–13 September 2001; pp. 334–342.
- Jelodar, H.; Wang, Y.; Yuan, C.; Feng, X.; Jiang, X.; Li, Y.; Zhao, L. Latent Dirichlet Allocation (LDA) and Topic modeling: Models, applications, a survey. arXiv 2018, arXiv:1711.04305.
- Luo, Y.; Chen, Y.; Chen, Q.; Liang, Q. A new election algorithm for DPos consensus mechanism in blockchain. In Proceedings of the 2018 7th International Conference on Digital Home (ICDH), Guilin, China, 30 November–1 December 2018; pp. 116–120.
- Saad, S.M.S.; Radzi, R.Z.R.M. Comparative review of the blockchain consensus algorithm between proof of stake (pos) and delegated proof of stake (dpos). Int. J. Innov. Comput. 2020, 10.
- Cao, B.; Zhang, Z.; Feng, D.; Zhang, S.; Zhang, L.; Peng, M.; Li, Y. Performance analysis and comparison of PoW, PoS and DAG based blockchains. Digit. Commun. Networks 2020, 6, 480–485.
- Baliga, A. Understanding blockchain consensus models. Persistent 2017, 4, 14.
- Yaga, D.; Mell, P.; Roby, N.; Scarfone, K. Blockchain technology overview. arXiv 2019, arXiv:1906.11078.
- Segall, A. Distributed network protocols. IEEE Trans. Inf. Theory 1983, 29, 23–35.
Iteration | PR(The1) | PR(Cat) | PR(Sitting) | PR(On) | PR(Mat) | PR(The2) |
---|---|---|---|---|---|---|
0 | 0.0250 | 0.0250 | 0.0250 | 0.0250 | 0.0250 | 0.0250 |
1 | 0.0250 | 0.0569 | 0.0975 | 0.0975 | 0.0569 | 0.0250 |
2 | 0.0250 | 0.0704 | 0.1263 | 0.1263 | 0.0704 | 0.0250 |
3 | 0.0250 | 0.0801 | 0.1427 | 0.1427 | 0.0801 | 0.0250 |
4 | 0.0250 | 0.0859 | 0.1509 | 0.1509 | 0.0859 | 0.0250 |
5 | 0.0250 | 0.0895 | 0.1555 | 0.1555 | 0.0895 | 0.0250 |
6 | 0.0250 | 0.0919 | 0.1581 | 0.1581 | 0.0919 | 0.0250 |
7 | 0.0250 | 0.0934 | 0.1597 | 0.1597 | 0.0934 | 0.0250 |
8 | 0.0250 | 0.0944 | 0.1607 | 0.1607 | 0.0944 | 0.0250 |
9 | 0.0250 | 0.0951 | 0.1612 | 0.1612 | 0.0951 | 0.0250 |
10 | 0.0250 | 0.0955 | 0.1615 | 0.1615 | 0.0955 | 0.0250 |
Algorithm | Dataset Size | Number of Similar Data | FN | FP |
---|---|---|---|---|
VSM | 1419 | 796 | 240 | 81 |
SimHash | 1419 | 796 | 188 | 126 |
Pretrained word model | 1419 | 796 | 163 | 179 |
P-SimHash | 1419 | 796 | 150 | 87 |
WMF-LDA | 1419 | 796 | 136 | 110 |
Algorithm | Dataset Size | Number of Similar Data | FN | FP |
---|---|---|---|---|
VSM | 600 | 300 | 33 | 48 |
SimHash | 600 | 300 | 32 | 57 |
Pretrained word model | 600 | 300 | 31 | 43 |
P-SimHash | 600 | 300 | 18 | 29 |
WMF-LDA | 600 | 300 | 30 | 15 |
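The FN and FP counts in the two tables above can be turned into precision, recall, and accuracy with a few lines of arithmetic. This sketch assumes that "Number of Similar Data" is the number of truly similar items (the positives) and that every item in the dataset is classified exactly once; under those assumptions TP = similar − FN. The derived figures are illustrative and are not quoted from the article.

```python
def metrics(dataset_size, similar, fn, fp):
    """Derive precision, recall and accuracy from the table columns.

    Assumes `similar` counts the true positives plus false negatives.
    """
    tp = similar - fn
    precision = tp / (tp + fp)
    recall = tp / similar
    accuracy = (dataset_size - fn - fp) / dataset_size
    return precision, recall, accuracy


# Rows copied from the second table (dataset size 600).
rows = {
    "VSM": (600, 300, 33, 48),
    "SimHash": (600, 300, 32, 57),
    "Pretrained word model": (600, 300, 31, 43),
    "P-SimHash": (600, 300, 18, 29),
    "WMF-LDA": (600, 300, 30, 15),
}

for name, row in rows.items():
    precision, recall, accuracy = metrics(*row)
    print(f"{name}: precision={precision:.3f} "
          f"recall={recall:.3f} accuracy={accuracy:.3f}")
```

The same function applies unchanged to the rows of the first table (dataset size 1419, 796 similar items).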
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Zhang, L.; Liu, S.; Ma, C.; Su, T. Innovative Application of Blockchain Technology for Digital Recipe Copyright Protection. Appl. Sci. 2023, 13, 9803. https://doi.org/10.3390/app13179803