Abstract
The core challenge of steganography is to improve both hiding capacity and concealment. Most current generation-based linguistic steganography methods consider only the probability distribution over text characters, so the emotion and topic of the generated steganographic text cannot be controlled. For long texts in particular, concealment improves when the generated sentences relate to a common topic and exhibit overall coherence and discourse-relatedness. In this article, we address the problem of generating coherent multi-sentence texts for better concealment and present a topic-aware neural linguistic steganography method that generates a steganographic paragraph on a specific topic. We achieve topic-controllable generation of long steganographic text by encoding related entities and their relationships from knowledge graphs. Experimental results show that the proposed method preserves both the quality of the generated steganographic text and its relevance to a specific topic. The proposed model can be applied to covert communication, privacy protection, and other areas of information security.
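To make the idea of generation-based linguistic steganography concrete, the following is a minimal sketch of fixed-length candidate coding, a common baseline in this family of methods. It is not the method proposed in the article: the knowledge-graph topic conditioning is omitted, and `TOY_MODEL`, `top_candidates`, `embed`, and `extract` are hypothetical names introduced only for illustration. The toy bigram table stands in for a neural language model.

```python
from typing import Dict, List

# Hypothetical toy "language model": previous token -> next-token probabilities.
# A real system would use a neural model conditioned on the full context.
TOY_MODEL: Dict[str, Dict[str, float]] = {
    "<s>": {"the": 0.6, "a": 0.4},
    "the": {"graph": 0.7, "topic": 0.3},
    "a": {"graph": 0.5, "topic": 0.5},
    "graph": {"encodes": 0.6, "links": 0.4},
    "topic": {"guides": 0.8, "links": 0.2},
    "encodes": {"the": 0.5, "a": 0.5},
    "links": {"the": 0.9, "a": 0.1},
    "guides": {"the": 0.7, "a": 0.3},
}


def top_candidates(probs: Dict[str, float], bits_per_token: int) -> List[str]:
    """Return the 2**bits_per_token most probable tokens, ties broken by name."""
    ranked = sorted(probs.items(), key=lambda kv: (-kv[1], kv[0]))
    cands = [tok for tok, _ in ranked[: 2 ** bits_per_token]]
    if len(cands) < 2 ** bits_per_token:
        raise ValueError("not enough candidates for the requested capacity")
    return cands


def embed(bits: str, model: Dict[str, Dict[str, float]],
          bits_per_token: int = 1, start: str = "<s>") -> List[str]:
    """Hide a bit string by letting each bit group select the next token."""
    if len(bits) % bits_per_token != 0:
        raise ValueError("bit string length must be a multiple of bits_per_token")
    tokens: List[str] = []
    prev = start
    for i in range(0, len(bits), bits_per_token):
        cands = top_candidates(model[prev], bits_per_token)
        prev = cands[int(bits[i:i + bits_per_token], 2)]
        tokens.append(prev)
    return tokens


def extract(tokens: List[str], model: Dict[str, Dict[str, float]],
            bits_per_token: int = 1, start: str = "<s>") -> str:
    """Recover the bits by replaying the same model and reading candidate indices."""
    bits: List[str] = []
    prev = start
    for tok in tokens:
        cands = top_candidates(model[prev], bits_per_token)
        bits.append(format(cands.index(tok), f"0{bits_per_token}b"))
        prev = tok
    return "".join(bits)


if __name__ == "__main__":
    secret = "10110"
    stego = embed(secret, TOY_MODEL)
    print(" ".join(stego))
    print(extract(stego, TOY_MODEL) == secret)
```

Because sender and receiver share the same model and the same deterministic candidate ordering, the receiver can recover the bits from the text alone. In this sketch, increasing `bits_per_token` raises capacity but forces the selection of less probable tokens, which is the capacity and concealment trade-off the abstract refers to.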
Index Terms
Topic-aware Neural Linguistic Steganography Based on Knowledge Graphs