In Other Words: Text Style Transfer
Naitian Zhou
Thomas Jefferson High School for Science and Technology
This article was originally included in the 2018 print publication of Teknos Science Journal.
“We present initial investigation into the task of paraphrasing language while targeting a particular writing style” [11].
Lines of pseudo-English scrolled across the laptop screen at lightning speed. The sentence above, in layman’s terms, means “We tried to translate text into a specific writing style.” The two sentences mean the same thing; the latter is simply a “paraphrase” of the original. The ambiguity and variety of language – one of its greatest strengths as a medium of communication – mean there are infinitely many distinct writing styles. In what style would an average person write that sentence? What about Twain, or Shakespeare? How can an algorithm be developed to achieve this kind of style transfer? How can the content of one document be merged with the writing style of another?
The idea of text style transfer existed long before computers, in a different form: pastiche. Pastiches are imitations of another writer’s style, often with a lighthearted spin. Today, the popular author Neil Gaiman is recognized for his mastery of the pastiche [2], but its history stretches back to 11th-century Syrian literature [1]. That is not to say the concept is antiquated: websites have popped up that offer translations from plain modern English into “Yodaish”, pirate speak, and even the “Swedish Chef” speak of The Muppets [4].
In 2016, Gatys, Ecker, and Bethge introduced a technique for style transfer between images using convolutional neural networks [5]. These networks can separate the content of an image, its underlying shapes, from its style, which lives in the finer details. Their work laid the foundation for a comparable kind of transfer, this time with language.
To achieve style transfer, one must balance two natural language processing tasks: replicating style and replicating content. Scientists have researched these tasks separately with moderate success.
Using recurrent neural networks that operate character by character, computers can generate text that closely resembles the text they were trained on. This type of algorithm works by learning to predict the next letter given a string of previous letters. Thus, if the input is “What is your nam”, the output should be “e”. The limitation of this approach is that it concerns itself only with letters, not meaning. Although such a model may learn correct spellings and even sentence structures, its output is often incomprehensible, because nothing it learns captures the semantics of what it writes [6].
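As a rough illustration of the prediction task (not of the recurrent network itself), the sketch below uses a simple frequency table over four-character contexts; the toy corpus, the context length, and the function names are all made up for the example.

```python
from collections import Counter, defaultdict

# Toy corpus; a real character-level model trains on megabytes of text.
corpus = "what is your name? what is your favorite color? what is your quest?"

CONTEXT = 4  # number of preceding characters used to predict the next one

# Count how often each character follows each four-character context.
counts = defaultdict(Counter)
for i in range(len(corpus) - CONTEXT):
    context = corpus[i:i + CONTEXT]
    next_char = corpus[i + CONTEXT]
    counts[context][next_char] += 1

def predict_next(text):
    """Return the most frequent next character for the last CONTEXT characters."""
    context = text[-CONTEXT:]
    if context not in counts:
        return None
    return counts[context].most_common(1)[0][0]

print(predict_next("what is your nam"))  # -> 'e'
```

A recurrent network does the same job far more flexibly, but the point stands: everything it sees and predicts is a character, never a meaning.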
To teach computers semantics rather than just style, the word and sentence levels, which carry meaning, must be examined instead of individual letters. Word vectors, or embeddings, are fixed-length lists of numbers that represent a word (the length itself is an arbitrary design choice). These embeddings can be learned by taking a corpus, which is a large body of text, and guessing each word in the corpus from its surrounding words. Given the context “I drove my” and “to work today”, the center word can be predicted to be “car”. By looking at thousands of these examples over millions of iterations, the computer can generate vectors that embed meaning [9].
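A minimal sketch of that guessing game, assuming a handful of hand-written toy vectors in place of real learned embeddings such as GloVe; the words, numbers, and function names here are invented for illustration.

```python
import numpy as np

# Hand-written three-dimensional toy vectors; real embeddings such as GloVe have
# hundreds of dimensions and are learned from billions of words.
embeddings = {
    "car":    np.array([0.9, 0.1, 0.0]),
    "banana": np.array([0.0, 0.9, 0.1]),
    "drove":  np.array([0.7, 0.0, 0.3]),
    "work":   np.array([0.5, 0.1, 0.6]),
}

def cosine(a, b):
    return float(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))

def predict_center(context_words, candidates):
    """Guess the missing word: average the context vectors, then pick the
    candidate whose vector points in the most similar direction."""
    context_vec = np.mean([embeddings[w] for w in context_words], axis=0)
    return max(candidates, key=lambda w: cosine(embeddings[w], context_vec))

# "I drove my ___ to work today", keeping only the words we have vectors for
print(predict_center(["drove", "work"], ["car", "banana"]))  # -> 'car'
```

In training, the prediction runs the other way: the vectors start out random, and they are nudged until this kind of guess comes out right across the whole corpus.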
My research more closely resembles this latter approach to understanding text. In 2015, Kiros et al. applied the idea behind word embeddings to whole sentences, generating fixed-length vectors called skip-thought vectors that embed the meaning of a sentence [7]. A useful side effect is that when these embeddings are generated from a corpus with a consistent style, the style is also captured in the vectors. This means a style transfer can be applied to a sentence by subtracting the mean of the vectors that share its style and then adding the mean of the vectors with the target style.
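The arithmetic itself is simple enough to sketch. Below, random vectors stand in for real skip-thought encodings of two corpora, and the decoder that would turn the shifted vector back into a sentence is omitted; the corpus names and dimensions are invented.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for skip-thought vectors: in practice each row would come from a
# trained encoder run over one sentence, with thousands of dimensions.
modern_vectors = rng.normal(loc=0.0, scale=1.0, size=(100, 8))
shakespeare_vectors = rng.normal(loc=1.5, scale=1.0, size=(100, 8))

# Approximate each style by the mean vector of its corpus.
modern_style = modern_vectors.mean(axis=0)
shakespeare_style = shakespeare_vectors.mean(axis=0)

# Transfer: remove the source style from a sentence vector, add the target style.
x = modern_vectors[0]                        # a sentence encoded from the modern corpus
x_transferred = x - modern_style + shakespeare_style

# A decoder (omitted) would turn this shifted vector back into words.
print(x_transferred.round(2))
```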
In June of 2017, researchers from MIT published a paper detailing a different method, one that explicitly targeted style transfer as its goal, as opposed to skip-thoughts, which generate generic sentence embeddings. This method uses a process of “cross-alignment,” in which models trained on two corpora with two different styles are aligned so that the input to one model can be used to generate output from the other [10]. The method rests on an underlying assumption that the content distributions of the two corpora are the same. Because the distinction between style and content is not clearly defined, the algorithm depends on the overall content of the two corpora being similar in order to separate out the style. That is, although the two corpora do not need to match up word for word, they should be about roughly the same things (T. Shen, personal communication, January 13, 2018).
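The sketch below shows only the wiring of that idea: untrained random matrices stand in for the recurrent networks of the actual cross-alignment model, the adversarial training that performs the alignment is omitted, and all dimensions and variable names are invented.

```python
import numpy as np

rng = np.random.default_rng(1)
VOCAB, HIDDEN = 1000, 64

# Untrained random matrices stand in for the recurrent encoder and the two
# style-specific decoders of the cross-alignment model; only the wiring is shown.
encoder = rng.normal(size=(VOCAB, HIDDEN))            # shared: sentence -> content vector
decoder_style_a = rng.normal(size=(HIDDEN, VOCAB))    # re-expresses content in style A
decoder_style_b = rng.normal(size=(HIDDEN, VOCAB))    # re-expresses content in style B

# A fake bag-of-words sentence from the style-A corpus.
sentence_a = np.zeros(VOCAB)
sentence_a[[3, 17, 256]] = 1.0

content = sentence_a @ encoder           # strip away the style, keep the content
as_style_b = content @ decoder_style_b   # generate the same content in style B

# Training (omitted) aligns the two models so that transferred sentences are
# indistinguishable from real target-style sentences, which only works if the
# two corpora are about roughly the same things.
print(as_style_b.shape)  # (1000,)
```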
In 2016, The Verge published a story about Eugenia Kuyda, a computer scientist who developed a chatbot that mimicked the messages of her deceased friend. The article describes Kuyda’s hesitance in creating this artificial intelligence: she questioned whether it would aid her grieving or merely make the pain worse [8]. Artificial intelligence has progressed to a point where the line between humanity and computers is increasingly blurred, and where difficult moral and ethical questions must be answered before additional progress can be made.
Language style transfer is a new field under active research, and for good reason. Most directly, it can increase the readability of dense texts by conveying the same information in simpler language. It can also protect authors’ anonymity by placing a stylistic mask over their words. In a broader sense, successful style transfer means computers can both understand and generate language. One of humanity’s defining features may finally be accessible to machines.
References
[1] Andrea, A. J. (n.d.). Muslim sources for the crusades. Retrieved from ABC-CLIO database. (Accession No. 585323)
[2] Ferguson, A. (n.d.). Neil Gaiman. Retrieved from ABC-CLIO database. (Accession No. 585323)
[3] Fu, Z., Tan, X., Peng, N., Zhao, D., & Yan, R. (2018, February). Style transfer in text: Exploration and evaluation. Paper presented at The Thirty-Second AAAI Conference on Artificial Intelligence, New Orleans, LA. Retrieved from https://arxiv.org/pdf/1711.06861.pdf
[4] Fun Translations [Computer software]. (2018). Retrieved from http://funtranslations.com/
[5] Gatys, L. A., Ecker, A. S., & Bethge, M. (2016). Image style transfer using convolutional neural networks. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (pp. 2414-2423). Retrieved from https://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Gatys_Image_Style_Transfer_CVPR_2016_paper.pdf
[6] Karpathy, A. (2015, May 21). The unreasonable effectiveness of recurrent neural networks [Blog post]. Retrieved from Andrej Karpathy blog: http://karpathy.github.io/2015/05/21/rnn-effectiveness/
[7] Kiros, R., Zhu, Y., Salakhutdinov, R. R., Zemel, R., Urtasun, R., Torralba, A., & Fidler, S. (2015). Skip-thought vectors. In C. Cortes, N. Lawrence, D. Lee, M. Sugiyama, & R. Garnett (Eds.), Advances in Neural Information Processing Systems (Vol. 28, pp. 3294-3302). Retrieved from http://papers.nips.cc/paper/5950-skip-thought-vectors.pdf
[8] Newton, C. (2016, October 6). Speak, memory (J. Dzieza & M. Zelenko, Eds.). Retrieved January 25, 2018, from The Verge website: https://www.theverge.com/a/luka-artificial-intelligence-memorial-roman-mazurenko-bot
[9] Pennington, J., Socher, R., & Manning, C. D. (2014). GloVe: Global vectors for word representation. In Empirical methods in natural language processing (EMNLP) (pp. 1532-1543). Retrieved from http://www.aclweb.org/anthology/D14-1162
[10] Shen, T., Lei, T., Barzilay, R., & Jaakkola, T. (2017). Style transfer from non-parallel text by cross-alignment. In Advances in Neural Information Processing Systems (Vol. 30). Retrieved from https://papers.nips.cc/paper/7259-style-transfer-from-non-parallel-text-by-cross-alignment.pdf
[11] Xu, W., Ritter, A., Dolan, W. B., Grishman, R., & Cherry, C. (2012). Paraphrasing for style. In Proceedings of COLING 2012: Technical Papers (pp. 2899-2914). Retrieved from http://www.aclweb.org/anthology/C12-1177