I would approach the problem slightly differently. It's significant that "end" and "dependent" overlap, yet that information is lost in your word Map. Instead of a single word map, create a set of word maps, each one representing a possible segmentation of the column name into non-overlapping words. You can then compute a score for each segmentation based on the lengths and probabilities of its words. The score for a segmentation would be the average of the scores of its individual words, and the score for an individual word would be some function of its length (l) and probability (p), something like
score = a*l + b*p
where a and b are weights that you can tweak to get the right mix. Pick the segmentation with the highest average score. The scoring function doesn't have to be a linear weighting either; you could experiment with logarithmic, exponential, or higher-order terms (squares, for instance).
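As a rough sketch of the idea, here is one way to enumerate all non-overlapping segmentations recursively and keep the highest-scoring one. The dictionary, its probabilities, and the weights a and b are placeholder values for illustration, not part of your original setup:

```java
import java.util.*;

public class Segmenter {
    // Hypothetical word probabilities; substitute your own word Map here.
    static Map<String, Double> prob = Map.of(
            "end", 0.05, "dependent", 0.02, "date", 0.04, "depend", 0.03);

    // Tweakable weights for length (a) and probability (b).
    static double a = 0.5, b = 10.0;

    // Score for an individual word: a*l + b*p.
    static double wordScore(String w) {
        return a * w.length() + b * prob.getOrDefault(w, 0.0);
    }

    // Returns the best-scoring segmentation of s into dictionary words,
    // or null if no full segmentation exists.
    static List<String> best(String s) {
        if (s.isEmpty()) return new ArrayList<>();
        List<String> bestSeg = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (int i = 1; i <= s.length(); i++) {
            String head = s.substring(0, i);
            if (!prob.containsKey(head)) continue;        // head must be a known word
            List<String> tail = best(s.substring(i));     // segment the remainder
            if (tail == null) continue;                   // remainder not segmentable
            List<String> seg = new ArrayList<>();
            seg.add(head);
            seg.addAll(tail);
            // Segmentation score = average of the word scores.
            double score = seg.stream().mapToDouble(Segmenter::wordScore)
                              .average().orElse(0);
            if (score > bestScore) {
                bestScore = score;
                bestSeg = seg;
            }
        }
        return bestSeg;
    }

    public static void main(String[] args) {
        System.out.println(best("dependentdate")); // prefers "dependent" over "depend"
    }
}
```

Column names are short, so the naive recursion is fine; for longer strings you'd memoize `best` on the suffix to avoid re-segmenting the same tail repeatedly.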