Why NLP Is the Crux of AI
And why without it, the entire concept of AI doesn't make much sense.
Natural language processing (NLP) is a branch of data science that aims to bridge the gap between human communication and computer understanding. Without satisfactory natural language processing, Google would never know whether you want to see new fruit varieties or hot gadgets when you search for “new apple arrivals.” Without it, Siri, Alexa, Cortana, and Bixby couldn’t assist you with anything.
Yet what I will argue here is that without adequate NLP, the entire concept of AI is impossible (or at least doesn’t make much sense).
Robots are after our jobs...
AI is something we’ve learned to hope for and dread. Yet love it or hate it, it’s not arriving any time soon. With all the debates, controversies, and clickbait headlines in the press, one might naïvely assume that AI is already here, or at least bound to emerge in the near future.
Especially vocal are people concerned with privacy issues and with the unemployment AI will supposedly create. Finding a balance between privacy and security is too complex an issue to tackle here, so I’ll leave it for another time.
As for the alleged unemployment, need I remind you of the Luddites, or of the ferry boatmen who protested every single bridge built throughout history?
Every new technology that improves our lives by eliminating menial and repetitive tasks wipes out some jobs, yet it creates other jobs in their place. Would you like an army of librarians to look up information for you instead of Google (making every search quite an expensive feat)? I doubt it. Why, then, lament the demise of typists, cashiers, bank clerks, and shop assistants, at least as we know them? Why would people want to be typists instead of, say, writers and developers? Creative jobs aren’t threatened by the impending advent of AI, nor are the jobs that require human-specific skills. People will still cater to other people’s needs, only they will do work that is more rewarding: more fulfilling and more financially lucrative.
... but they are actually quite dumb
However, machine learning is not AI yet. It’s a promising technology, but to be accurate it needs humongous datasets processed by humans: millions upon millions of instances in which the algorithm’s suggestions were approved or rejected by actual people. It is merely a tool for handling data, and to hone this tool we need crowdsourcing to collect the data and teams of data scientists to crunch it.
Spam filters, chatbots, personal assistants, media recommendations on streaming services, algorithms that match you with writers for help with research papers, plagiarism checkers, and online editors are all possible thanks to NLP, and they can only get better if we manage to improve NLP technology. Yet we’re only taking the first baby steps toward making them efficient.
Here is an example from real life. I submit posts to well-known sites like Medium, The Innovation Enterprise, Thrive Global, and others. Naturally, I am one of many, many writers, and the volume of submissions is staggering. Of course, editors seek to optimize their work and use automation to vet as many submissions as possible before they reach a human editor. This automation must be reasonably efficient.
However, more than once I have had submissions rejected on the grounds of “too many grammatical issues” only because my story contained one or two slang words that made perfect sense in context. At least, they would to a human, but they didn’t to the algorithms. Failing to find the words in their memory banks (a.k.a. dictionaries), they ruled them errors. Moreover, since the words in question were non-words to the algorithms, they created lacunae in the syntactic structure, making the sentence incomplete by the logic of the machine. I know as much because my posts were accepted later, once I removed the words in question and resubmitted the texts otherwise intact.
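The failure mode described above is easy to sketch: a purely dictionary-based checker flags any out-of-vocabulary word as an error, slang included. This is a minimal toy illustration, not the actual system any of these sites use; the vocabulary and the flagged word are hypothetical.

```python
# Toy sketch of dictionary-based vetting: any word missing from the
# checker's vocabulary is reported as a "grammatical issue", even if it
# is perfectly good slang in context. Hypothetical tiny vocabulary:
VOCABULARY = {"this", "gadget", "is", "a", "great", "buy"}

def vet_submission(text: str) -> list[str]:
    """Return the words a naive dictionary checker would flag as errors."""
    flagged = []
    for raw in text.split():
        word = raw.strip(".,!?\"'").lower()  # drop surrounding punctuation
        if word and word not in VOCABULARY:
            flagged.append(word)
    return flagged

print(vet_submission("This gadget is a great buy."))  # [] - nothing flagged
print(vet_submission("This gadget is a dope buy."))   # ['dope'] - slang "ruled an error"
```

A human reader resolves "dope" from context in an instant; the lookup table cannot, which is exactly the gap the anecdote points at.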
I know this evidence is anecdotal, but I run into similar issues frequently when I use the various automatic editors available online. Sometimes they demonstrate surprising finesse, detecting errors that usually fly under the radar of most non-machine-learning static spell-checkers (typos such as form vs. from, quite vs. quiet, etc.). Yet sometimes those same algorithms fail miserably to detect the subject of a sentence and jump to wrong conclusions, suggesting, quite frankly, nonsensical corrections. Sometimes they even suggest two contradictory corrections, creating an infinite loop.
Automation does wonders. However, for most of those wonders to work, we must first get a clever human to write code that foresees every possible option. If a real-life scenario doesn’t have a prescribed solution embedded in the algorithm, the machine is as dumb and helpless as a newborn.
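That "prescribed solution" limitation can be shown in a few lines: a rule-based handler copes only with the inputs its author anticipated and falls through on everything else. The commands and responses below are invented for illustration.

```python
# Minimal sketch of a rule-based assistant: every supported scenario must
# be written down in advance. Hypothetical rules:
RULES = {
    "book a table": "Booking a table...",
    "play music": "Playing music...",
}

def handle(request: str) -> str:
    """Match the request against prescribed patterns; otherwise give up."""
    for pattern, response in RULES.items():
        if pattern in request.lower():
            return response
    # No prescribed solution embedded for this scenario -> helpless.
    return "Sorry, I don't understand."

print(handle("Please book a table for two"))              # matches a rule
print(handle("I want to go out with a friend after work"))  # falls through
```

Adding a new capability means a human writing a new rule; nothing in the code generalizes on its own.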
When they learn to speak, though...
How complex the tasks a future AI could tackle depends directly on how good it is at extracting information from natural language.
- Personal assistants would stop being frustratingly literal and could process and understand a request like “I want to go out with a friend after work today, take care of that.” That’s how humans communicate, not “restaurant in Seattle, book a table, May 22, 6PM.”
- Rather than groping for precise keywords, users could use voice search to look for things and information in a natural way, simply by asking questions. Currently, if you ask a question, the search engine matches you with similar questions, not with answers. You are lucky if your question happens to be satisfactorily answered by other users in the thread. If not, tough luck: go figure out the keywords to search for.
- Algorithms would be able to elicit valuable insights from unstructured data such as video, audio, books, and other long texts. On top of summarizing, they would analyze tone of voice, word choice, emotions, and connotations, and add these findings to their output. This would do wonders for healthcare, social studies, and everyday customer service.
- Self-driving vehicles could also improve drastically if they understood speech. For example, instead of programming your GPS navigator with exact coordinates, you would be able to say, “Stop at a drive-in on the way home” or “First, I want to pick up my daughter from school.”
Those scenarios are exactly what we imagine when asked to picture a future shaped by AI. Yet until we perfect natural language processing technology, we are bound to keep adjusting the way we think and speak to the way algorithms do, which is a far less promising and far more dystopian prospect.
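The gap between the two assistant phrasings above can be made concrete with a toy slot-filler: it extracts structure from the rigid keyword command but gets nothing from the natural request. The slot names and regular expressions are hypothetical, chosen only to fit this one example.

```python
# Sketch of keyword-style command parsing: rigid patterns work, natural
# phrasing yields nothing. Slot names are invented for illustration.
import re

def parse_command(utterance: str) -> dict:
    """Fill city/date/time slots from a rigidly phrased booking command."""
    slots = {}
    m = re.search(r"restaurant in (\w+)", utterance)
    if m:
        slots["city"] = m.group(1)
    m = re.search(r"(\w+ \d{1,2}), (\d{1,2}\s?[AP]M)", utterance)
    if m:
        slots["date"], slots["time"] = m.group(1), m.group(2)
    return slots

print(parse_command("restaurant in Seattle, book a table, May 22, 6PM"))
# {'city': 'Seattle', 'date': 'May 22', 'time': '6PM'}
print(parse_command("I want to go out with a friend after work today"))
# {} - the rigid parser extracts nothing from natural phrasing
```

Until NLP closes that gap, it is the user who must translate intent into the machine's rigid format, not the other way around.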