Creating a New Syntactic Parser?

The chain of thought behind the idea

While working on my last project, on claim detection in online text, I hypothesized that syntactic information could be used alongside semantic information, given the proven (and claimed) importance of sentence structure in the formation of a claim. While generating POS tags and dependency trees, I realised a gap we have so blissfully been ignorant of.
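
For concreteness, here is a minimal sketch of the kind of parse I mean, using spaCy (the library, model and sentence are purely illustrative, not a description of my project's actual pipeline):

```python
import spacy

# Requires the small English model: python -m spacy download en_core_web_sm
nlp = spacy.load("en_core_web_sm")

doc = nlp("Vaccines cause more harm than good, according to this post.")

# POS tag, dependency label and syntactic head for every token
for token in doc:
    print(f"{token.text:<12} POS={token.pos_:<6} dep={token.dep_:<10} head={token.head.text}")
```

Each line of output ties a word to its grammatical role and to its head in the dependency tree, which is exactly the kind of structural signal a claim seems to depend on.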

Before the advent of deep learning, major NLP tasks relied on syntactic parsers, which produced outputs such as POS tags, dependency trees and constituency trees. With the rise of deep learning, more tasks could be magically solved. These black boxes, preceded by embedding methods like BERT, need meaningful inputs such as sentences. So now we have a tree that contains a lot of useful syntactic information, which is inherently important for any machine trying to learn natural language, AND we have this revolutionary black box (DL), BUT we do not have anything that connects the two.
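
To make the disconnect concrete: a BERT-style encoder sees only a flat sequence of subword tokens, so the parse above has no obvious place to go. A minimal sketch with the Hugging Face transformers library (model name and sentence are again just illustrative):

```python
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

# The model consumes a flat subword sequence; the parser's tree structure
# is simply not part of this input.
inputs = tokenizer("Vaccines cause more harm than good.", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# One contextual vector per subword token, with no explicit syntax anywhere
print(outputs.last_hidden_state.shape)  # (batch, num_subword_tokens, hidden_size)
```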

This makes me question whether dependency and constituency trees are the best way to represent syntactic information for the deep learning algorithms of the coming age.

Is there any other way to encapsulate subject-verb-object agreement, and the notions of grammar and sentence formation, in a form that is meaningful for BERT/XLNet and, consequently, for CNNs/LSTMs?

Maybe what we are looking for is a new form of syntactic parsing, one that builds on past ideas and can also capture, in a machine-interpretable way, the sentence structure of the gen-z lingo that exists online?

Maybe there already exists some parsing scheme, for English or for a low-resource language, that has accidentally solved this problem and is therefore applicable to this chain of thought?

If this makes you think or if there is something I’ve missed, pick my brain at shreyag [at] iiitd [dot] ac [dot] in

Shreya Gupta
Research Associate

I aim to understand the world behind the three lines of code (import, train, test), challenge conventional approaches, and build more efficient and applicable algorithms.