Question
I have to implement horizontal markovization (NLP concept) and I'm having a little trouble understanding what the trees will look like. I've been reading the Klein and Manning paper, but they don't explain what the trees with horizontal markovization of order 2 or order 3 will look like. Could someone shed some light on the algorithm and what the trees are SUPPOSED to look like? I'm relatively new to NLP.
Answer 1:
So, let's say you have a bunch of flat rules like:

    NP
      NNP
      NNP
      NNP
      NNP

or

    VP
      V
      Det
      NP
When you binarize these you want to keep the context (i.e. this isn't just a Det but specifically a Det following a Verb as part of a VP). To do so normally you use annotations like this:
    NP
      NNP
      NP->NNP
        NNP
        NP->NNP->NNP
          NNP
          NP->NNP->NNP->NNP
            NNP

or

    VP
      V
      VP->V
        Det
        VP->V->Det
          NP
You need to binarize the tree, but these annotations are not always very meaningful. They might be somewhat meaningful for the Verb Phrase example, but all you really care about for the other one is that a noun phrase can be a fairly long string of proper nouns (e.g. "Peter B. Lewis Building" or "Hope Memorial Bridge Project Anniversary"). So with Horizontal Markovization you will collapse some of the annotations slightly, throwing away some of the context. The order of Markovization is the amount of context you are going to retain. So with the normal annotations you are basically at infinite order: choosing to retain all context and collapse nothing.
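As a concrete sketch of what "binarize and keep all the context" means, here is a small Python function (my own illustrative code, not from the answer; the function name is mine) that right-binarizes a flat rule and annotates every intermediate node with the full chain of siblings generated so far, producing exactly the `NP->NNP->NNP`-style labels shown above:

```python
def binarize_full(parent, children):
    """Right-binarize a flat rule, annotating each intermediate
    node with the full list of siblings generated so far
    (i.e. infinite-order horizontal context)."""
    label = parent
    rules = []
    for child in children[:-1]:
        new_label = label + "->" + child           # e.g. NP->NNP, NP->NNP->NNP
        rules.append((label, [child, new_label]))  # binary rule
        label = new_label
    rules.append((label, [children[-1]]))          # final unary rule
    return rules

for lhs, rhs in binarize_full("NP", ["NNP", "NNP", "NNP", "NNP"]):
    print(lhs, "->", " ".join(rhs))
```

Reading the rules off top to bottom reproduces the nested binarized tree for `NP -> NNP NNP NNP NNP`; calling `binarize_full("VP", ["V", "Det", "NP"])` reproduces the `VP` tree the same way.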
Order 0 means you're going to drop all of the context and you get a tree without the fancy annotations, like this:
    NP
      NNP
      NNP
      NNP
      NNP
      NNP
      NNP
      NNP
Order 1 means you'll retain only one term of context and you get a tree like this:
    NP
      NNP
      NP->...NNP     (one term: NP->)
        NNP
        NP->...NNP   (one term: NP->)
          NNP
          NP->...NNP (one term: NP->)
            NNP
Order 2 means you'll retain two terms of context and you get a tree like this:
    NP
      NNP
      NP->NNP           (two terms: NP->NNP)
        NNP
        NP->NNP->...NNP (two terms: NP->NNP->)
          NNP
          NP->NNP->...NNP (two terms: NP->NNP->)
            NNP
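The three orders can be sketched in code as follows (my own illustrative Python, not from the answer). Note one assumption: the answer's exact ellipsis placement is informal, so this sketch follows the common Klein & Manning convention of keeping the `order` most recently generated siblings and marking anything dropped with `...`; its labels therefore differ slightly from the trees above in edge cases (e.g. the first intermediate node at order 1 keeps its only sibling without an ellipsis, since nothing has been dropped yet):

```python
def markov_label(parent, siblings, order):
    """Label for an intermediate node, keeping at most `order` of the
    most recently generated siblings; '...' marks dropped context."""
    kept = siblings[-order:] if order > 0 else []
    dots = "..." if len(siblings) > len(kept) else ""
    return parent + "->" + dots + "->".join(kept)

def binarize(parent, children, order):
    """Right-binarize a flat rule with horizontal markovization."""
    seen, label, rules = [], parent, []
    for child in children[:-1]:
        seen.append(child)
        new_label = markov_label(parent, seen, order)
        rules.append((label, [child, new_label]))
        label = new_label
    rules.append((label, [children[-1]]))
    return rules

binarize("NP", ["NNP"] * 4, 1)
# the second and later intermediate labels are all "NP->...NNP",
# which is what lets the grammar generate NNP strings of any length
```

The payoff is visible in the collapsed labels: at order 1 every intermediate node beyond the first gets the identical label `NP->...NNP`, so their rule counts pool together and the grammar generalizes to arbitrarily long runs of proper nouns.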
Answer 2:
I believe the idea is to take into account parent nodes for vertical markovization and sibling nodes for horizontal when estimating rule probabilities, and the order indicates how many of them are included. There's a nice picture for parent annotation here.
Also, a quote from http://www.timothytliu.com/files/NLPAssignment5.pdf:
To approach lexicalization, more information is added onto the parent nodes of each tree. This correctly differentiates between different attachments and whether or not to branch left or branch right. Horizontal Markovization is accomplished by keeping track of siblings as the tree is binarized. Vertical Markovization is accomplished by keeping track of the parents of the node in the tree. These create new dependencies, as now the rules are a combination of both depth and breadth.
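To make the vertical side of this concrete, here is a minimal sketch of parent annotation (my own illustrative code, using Klein & Manning's `NODE^PARENT` notation; trees are represented as nested tuples, which is an assumption of this sketch, and real implementations usually exclude POS tags from the annotation):

```python
def parent_annotate(tree, ancestors=(), order=1):
    """Annotate each nonterminal with up to `order` ancestor labels
    (vertical markovization). Trees are (label, child, ...) tuples;
    leaves (words) are plain strings and are left unchanged."""
    if isinstance(tree, str):
        return tree
    label, *children = tree
    kept = ancestors[-order:] if order > 0 else ()
    new_label = "^".join((label,) + tuple(reversed(kept)))  # nearest parent first
    new_ancestors = ancestors + (label,)
    return (new_label,
            *[parent_annotate(c, new_ancestors, order) for c in children])

parent_annotate(("S", ("NP", ("NNP", "Peter")), ("VP", ("V", "runs"))))
# order 1 turns NP into NP^S, V into V^VP, and so on
```

With `order=2` a node would also remember its grandparent (e.g. `V^VP^S`), mirroring how the horizontal order controls how many siblings are remembered during binarization.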
Source: https://stackoverflow.com/questions/12884411/horizontal-markovization