The next literary hit to bear the same name as Jane Austen's 200-year-old masterpiece Emma isn't a book at all, but a thoroughly modern AI.
This Emma, formally Emma Identity, is the creation of computer science professor Aleksandr Marchenko and the world's first publicly available authorship identification web app. Give her at least 5,000 words of a written piece, and she'll use more than 50 mathematical parameters to figure out who wrote the lines.
According to her developers, she offers 85 per cent accuracy, the highest percentage engineers have been able to achieve.
Marchenko told Digital Trends that "the technology combines natural language processing (NLP) and machine learning with the techniques of stylometry, a study of linguistic style rooted in the 15th century".
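Emma's own parameters are not spelled out here, so the following is only a rough illustration of the general idea behind stylometry, not her method. This minimal Python sketch assumes a hypothetical feature set of ten common English function words and compares two texts by the cosine similarity of their word-frequency profiles; the names `FUNCTION_WORDS`, `style_profile` and `same_author_score` are invented for the example.

```python
import math
import re
from collections import Counter

# Hypothetical feature set: a few common English function words.
# A real system would use far more (and richer) parameters.
FUNCTION_WORDS = ("the", "and", "of", "to", "a", "in", "that", "it", "but", "with")


def style_profile(text):
    """Return the relative frequency of each function word in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        raise ValueError("text contains no words")
    counts = Counter(words)
    return [counts[word] / len(words) for word in FUNCTION_WORDS]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def same_author_score(known_text, questioned_text):
    """Score how closely two texts match in style: near 1.0 means similar profiles."""
    return cosine_similarity(style_profile(known_text), style_profile(questioned_text))
```

A higher score only means the two texts use these particular words at similar rates. On short samples such a score is unreliable, which is consistent with Emma asking for at least 5000 words.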
While the list of potential users ranges from journalists and lawyers to historians and the FBI (perhaps to track down the writers of blackmail notes), one of the primary target audiences is educators. Teachers and professors can use Emma's skills to confirm or rule out plagiarism they suspect in student assignments.
Considering that a variety of studies over the past 10 years have found that between one-third and one-half of high school students admit to plagiarism, a tool like Emma could make a big impact in classrooms.
It's also a step up from the basic cut-and-paste Google search that many lower-level teachers still rely on, or even from detection software such as Turnitin, one of the most popular plagiarism detection products in the world.
Educators, and researchers who test the efficacy of such software, such as those at HTW Berlin, have long complained that the products currently on the market can find text parallels by scanning text, but cannot determine actual plagiarism.
Emma's capability to determine authorship with such high accuracy should decrease the number of false negatives and positives.
If you want to test Emma's capabilities, she's currently in beta and you can try to trick her; first, you teach her an original text, and then run other texts against it.
She just has one rule: No cheating.