But what if it were possible to create new episodes ourselves, based on what is already out there? No, this did not suddenly turn into a fanscript site, believe me. What we are talking about is giving an AI encyclopedic knowledge of a particular topic, then asking it to generate new text in the same style. Still not a fanscript site. Neural networks excel at processing enormous amounts of data and locating patterns within it. So when we have a large amount of known-good source material, such as spending habits at stores, driving patterns for Los Angeles, or 30+ seasons of a particularly popular television show, we can ask the right questions of a piece of software and see what new material it produces based on the patterns it found. What we’re going to go over today is what it would take to generate text using Machine Learning (ML).
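To make the "find patterns, then produce something new" idea a little more concrete, here is a minimal sketch in Python. To be clear, this is not a neural network: it is a word-level Markov chain, the far simpler technique that Lyrics.rip (listed in the sources) credits for its generated lyrics. It only records which words tend to follow which, then walks those records at random. The function names `build_chain` and `generate`, the `order` parameter and the tiny sample corpus are all made up for illustration.

```python
import random
from collections import defaultdict


def build_chain(text, order=2):
    """Map every run of `order` consecutive words to the words seen right after it."""
    words = text.split()
    chain = defaultdict(list)
    for i in range(len(words) - order):
        state = tuple(words[i:i + order])
        chain[state].append(words[i + order])
    return dict(chain)


def generate(chain, length=20, seed=None):
    """Walk the chain from a random starting state, emitting at most `length` words."""
    rng = random.Random(seed)          # a fixed seed makes a run repeatable
    state = rng.choice(sorted(chain))  # sorted() keeps the choice stable for a given seed
    output = list(state)
    while len(output) < length:
        followers = chain.get(state)
        if not followers:              # dead end: this state only ended the source text
            break
        output.append(rng.choice(followers))
        state = tuple(output[-len(state):])
    return " ".join(output[:length])


if __name__ == "__main__":
    # Hypothetical stand-in corpus; a real experiment would load far more text.
    corpus = (
        "the show must go on and the show must end "
        "and the cast must go home and the crew must stay on"
    )
    chain = build_chain(corpus, order=2)
    print(generate(chain, length=12, seed=1))
```

With a corpus this small the output mostly reshuffles the input. The output only starts to resemble its source when there is a great deal of source text, which is why the amount of known-good material matters so much. A neural network plays the same basic game with a far richer notion of "pattern" than the previous couple of words.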
How would we Generate Text using ML?
You may have seen references in various locations saying “I fed a thousand hours of
Why would we Generate Text using ML?
Say, for instance, that there was a particular instructor you always learned from more easily. With enough time and reference material, it would be possible to produce a learning plan on any topic, tuned exactly to your requirements and delivered in whichever medium is most effective for you. Chatbots, such as those used in numerous technical support situations, could react in a far more human manner.
On the flip side, the same approach could be used to generate malicious messages that a person never wrote but that are incredibly difficult to distinguish from the genuine article. If someone gained access to an executive’s sent items folder and copied all of that data, with enough time it could theoretically be used to write a very convincing phishing email in that person’s style.
This becomes all the more frightening when combined with other technologies such as Deepfakes. In fact, in 2016 Adobe showed a proof-of-concept piece of software at its Adobe MAX conference known as Project VoCo. It was called “Photoshop for Voice” at the time, and it showed at least the potential to synthesize words and short phrases in a person’s voice closely enough to be indistinguishable from the real thing if you weren’t listening for it. With good enough source material to start from, the result would sound like the person to a decent degree. On a positive note, this would allow news organizations to have anchors effectively available around the clock without the person becoming a zombie in the process. On the other hand, it could also be used to incredible detriment in politics and legal proceedings.
Conclusion
This technology has a huge amount of potential and will continue to grow by leaps and bounds in the coming years. It absolutely has both legitimate and malicious uses, and it is going to require a considerable rethink of how users are trained to spot potential threats. As it stands, only the most sophisticated spear-phishing attempts can completely convince the target that the message comes from someone else. Combine text generated in a person’s own style with a falsified sender, leaving very little to indicate that the sender is not who they claim to be, and it opens up a whole new can of worms for Information Security. We will want to stay vigilant for new breakthroughs in this field for the foreseeable future if we want to protect our organizations, our users and ourselves from potential attack vectors.
Sources
Lyrics.rip “Generate Lyrics – All magic done by a Markov Chain” – https://www.lyrics.rip/
Machine Learning has revealed exactly how much of a Shakespeare play was written by someone else – https://www.technologyreview.com/2019/11/22/131857/machine-learning-has-revealed-exactly-how-much-of-a-shakespeare-play-was-written-by-someone/
Relative contributions of Shakespeare and Fletcher in Henry VIII: An Analysis Based on Most Frequent Words and Most Frequent Rhythmic Patterns – https://arxiv.org/abs/1911.05652
Machine Learning with (Monty) Python – https://www.linkedin.com/pulse/machine-learning-monty-python-marco-marchesi
Can AI write like Shakespeare? – https://towardsdatascience.com/can-ai-write-like-shakespeare-de710befbfee
Deepfakes – /topic/deepfake/
Adobe Audio Manipulator Sneak Peak with Jordan Peele | Adobe Creative Cloud – https://www.youtube.com/watch?v=I3l4XLZ59iw