Breaking string into multiple lines according to character width (python)

Question

I am drawing text atop a base image via PIL. One of the requirements is for it to overflow to the next line(s) if the combined width of all characters exceeds the width of the base image.

Currently I'm using textwrap.wrap(text, width=16) to accomplish this. Here width defines the number of characters to accommodate in one line. Now the text can be anything since it's user generated. So the problem is that hard-coding width won't take into account width variability due to font type, font size and character selection.

What do I mean?

Well, imagine I'm using DejaVuSans.ttf at size 14. A 'W' is 14 pixels wide, whereas an 'i' is 4. For a base image of width 400, up to 100 'i' characters fit on a single line, but only 28 'W' characters (400 / 14 ≈ 28.6). I need a smarter way of wrapping to the next line, one where the string is broken when the sum of character widths exceeds the base image width.

Can someone help me formulate this? An illustrative example would be great!

Answer 1:

Since you know the width of each character, put those widths in a dictionary and look them up to calculate the width of a string:

char_widths = {
    'a': 9,
    'b': 11,
    'c': 13,
    # ...and so on
}

From here you can look up each letter and use the sum to check your width:

current_width = sum(char_widths[letter] for letter in word)
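To show how the lookup above could drive the actual wrapping, here is a minimal sketch that greedily packs whole words into lines. The function name wrap_by_char_width, the default_width fallback for characters missing from the table, and the demo widths are all assumptions made for illustration; real widths depend on the font and size.

```python
def wrap_by_char_width(text, char_widths, max_width, default_width=10):
    """Greedily pack whole words into lines whose summed character widths fit max_width."""
    def width_of(s):
        # Characters missing from the table fall back to default_width (an assumption).
        return sum(char_widths.get(ch, default_width) for ch in s)

    lines, current = [], ""
    for word in text.split():
        candidate = word if not current else current + " " + word
        if width_of(candidate) <= max_width or not current:
            # The word fits, or it is alone on the line and has to go somewhere.
            current = candidate
        else:
            lines.append(current)
            current = word
    if current:
        lines.append(current)
    return lines


# Made-up widths for illustration only; real values depend on the font and size.
demo_widths = {"W": 14, "i": 4, " ": 4}
print(wrap_by_char_width("WWW iii WWW iii", demo_widths, max_width=60))
```

A single word wider than max_width is kept whole on its own line here; splitting such words is left out to keep the sketch short.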

Answer 2:

If precision matters, the best way to get the real text width is to actually render it, because font metrics are not always linear (kerning and font size both affect them, for example) and are therefore not easily predictable. We can approach the optimal breakpoint with ImageFont's measurement methods, which internally use the core font rendering routines (see the PIL source on GitHub): getsize in older Pillow releases, or getlength and getbbox in current ones, since getsize was removed in Pillow 10.

def break_text(txt, font, max_width):
    # Reuse the previous line's length as the first guess for the next line,
    # since consecutive lines usually hold a similar number of characters.
    subset = len(txt)

    while txt:
        subset = max(1, min(subset, len(txt)))

        # font.getlength needs Pillow >= 8.0; on older releases
        # font.getsize(...)[0] returns the width instead.
        width = font.getlength(txt[:subset])
        if width > max_width:
            # Jump close to the breakpoint using the average letter width
            subset = max(1, int(max_width * subset / width))

        # Grow one character at a time while the next character still fits
        while subset < len(txt) and font.getlength(txt[:subset + 1]) <= max_width:
            subset += 1

        # Shrink while the line is too wide, always keeping at least one
        # character so the loop is guaranteed to terminate
        while subset > 1 and font.getlength(txt[:subset]) > max_width:
            subset -= 1

        yield txt[:subset]
        txt = txt[subset:]

and use it like so:

from PIL import Image, ImageDraw, ImageFont

img = Image.new('RGBA', (100, 100), (255, 255, 255, 0))
draw = ImageDraw.Draw(img)
# Any TrueType font available on your system works here
font = ImageFont.truetype("Helvetica", 12)
text = "This is a sample text to break because it is too long for the image"

# Draw each line 16 pixels below the previous one
for i, line in enumerate(break_text(text, font, 100)):
    draw.text((0, 16*i), line, (255, 255, 255), font=font)
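The generator above breaks wherever the width runs out, which can split a word in two. As a variation, here is a rough sketch that binary-searches for the longest prefix that fits and then backs up to the last space when there is one. The names break_at_width and measure are hypothetical. measure is assumed to be any callable that returns the rendered width of a string and grows as the string gets longer; with Pillow 8.0 or later, font.getlength should be usable for it. The demo uses a stand-in measure of 7 pixels per character so that it runs without a font file.

```python
def break_at_width(text, measure, max_width):
    """Yield lines of text whose measured width does not exceed max_width."""
    while text:
        # Binary search for the longest prefix that still fits.
        # Starting at 1 guarantees progress even if one character is too wide.
        lo, hi = 1, len(text)
        while lo < hi:
            mid = (lo + hi + 1) // 2
            if measure(text[:mid]) <= max_width:
                lo = mid
            else:
                hi = mid - 1
        cut = lo

        # Prefer breaking at the last space so words stay intact when possible.
        if cut < len(text):
            space = text.rfind(" ", 0, cut + 1)
            if space > 0:
                cut = space

        yield text[:cut]
        text = text[cut:].lstrip(" ")


# Stand-in measure for the demo: every character counts as 7 pixels wide.
def fake_measure(s):
    return 7 * len(s)


for line in break_at_width("This is a sample text to break", fake_measure, 70):
    print(line)
```

Binary search needs only about log2(n) measurements per line instead of one per character, which matters when each measurement goes through the font renderer.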

Source: https://stackoverflow.com/questions/43827756/breaking-string-into-multiple-lines-according-to-character-width-python

Tags

python

word-wrap

python-textprocessing