Image to ASCII art conversion

Posted by 谁都会走 on 2019-11-26 15:34:38

There are several approaches to image to ASCII art conversion, mostly based on mono-spaced fonts. For simplicity I stick to the basics:

Pixel/area intensity based (Shading)

This approach handles each pixel, or area of pixels, as a single dot. The idea is to compute the average gray-scale intensity of this dot and then replace it with a character whose intensity is close enough to the computed one. For that we need a list of usable characters, each with a precomputed intensity; let's call it the character map. To choose quickly which character is best for which intensity, there are two ways:

  1. linearly distributed intensity character map

    So we use only characters whose intensities differ by the same step. In other words, when sorted in ascending order:

    intensity_of(map[i])=intensity_of(map[i-1])+constant;
    

    Also, when the character map is sorted, we can compute the character directly from the intensity (no search needed):

    character=map[intensity_of(dot)/constant];
    
  2. arbitrary distributed intensity character map

    So we have an array of usable characters and their intensities. We need to find the intensity closest to intensity_of(dot). Again, if map[] is sorted we can use binary search; otherwise we need an O(n) minimum-distance search loop or an O(1) dictionary. Sometimes, for simplicity, the character map[] can be handled as if it were linearly distributed, causing a slight gamma distortion that is usually unseen in the result unless you know what to look for.
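
    The two lookups can be sketched in standard C++ as follows. This is only an illustration under assumptions: the Glyph struct and the functions pick_linear and pick_nearest are hypothetical names, not part of the original code, and the map is assumed to be sorted ascending by intensity.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical character map entry: a glyph and its precomputed intensity.
struct Glyph { char c; int intensity; };

// Linearly distributed map: intensities are 0, step, 2*step, ...
// so the index is a plain division (clamped to the map size).
char pick_linear(const std::vector<Glyph>& map, int intensity, int step)
{
    assert(!map.empty() && step > 0);
    int i = intensity / step;
    const int last = static_cast<int>(map.size()) - 1;
    if (i < 0) i = 0;
    if (i > last) i = last;
    return map[i].c;
}

// Arbitrarily distributed map sorted ascending by intensity: binary search
// for the insertion point, then take the nearer of its two neighbours.
char pick_nearest(const std::vector<Glyph>& map, int intensity)
{
    assert(!map.empty());
    auto hi = std::lower_bound(map.begin(), map.end(), intensity,
        [](const Glyph& g, int v) { return g.intensity < v; });
    if (hi == map.begin()) return hi->c;
    if (hi == map.end())   return (hi - 1)->c;
    auto lo = hi - 1;
    return (intensity - lo->intensity <= hi->intensity - intensity) ? lo->c : hi->c;
}
```

    Note that the direct index rounds down rather than to the nearest entry, the same as map[intensity_of(dot)/constant] does.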

Intensity-based conversion also works well for gray-scale images (not just black and white). If you select the dot as a single pixel, the result gets large (1 pixel -> 1 character), so for larger images an area (a multiple of the font size) is selected instead, to preserve the aspect ratio and avoid enlarging the output too much.

How to do it:

  1. evenly divide the image into (gray-scale) pixels or (rectangular) areas (dots)
  2. compute the intensity of each pixel/area
  3. replace it with the character from the character map with the closest intensity
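
Steps 1 and 2 can be sketched in portable C++ like this. It is a minimal illustration, assuming the image is already an 8-bit gray-scale buffer stored row by row; the function name cell_intensities and its parameters are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Mean intensity of each cw x ch area (dot) of a row-major 8-bit gray-scale
// image of size w x h. Partial areas at the right/bottom edge are dropped.
// The result holds (w/cw) values per row of areas, row after row.
std::vector<int> cell_intensities(const std::vector<std::uint8_t>& gray,
                                  int w, int h, int cw, int ch)
{
    assert(w > 0 && h > 0 && cw > 0 && ch > 0);
    assert(gray.size() == static_cast<std::size_t>(w) * h);
    const int cols = w / cw, rows = h / ch;
    std::vector<int> out;
    out.reserve(static_cast<std::size_t>(cols) * rows);
    for (int cy = 0; cy < rows; cy++)
        for (int cx = 0; cx < cols; cx++)
        {
            long sum = 0;
            for (int y = 0; y < ch; y++)
                for (int x = 0; x < cw; x++)
                    sum += gray[static_cast<std::size_t>(cy * ch + y) * w + (cx * cw + x)];
            out.push_back(static_cast<int>(sum / (cw * ch)));
        }
    return out;
}
```

With cw = ch = 1 every pixel is its own dot; larger values trade detail for a smaller output.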

As the character map you can use any characters, but the result gets better if each character has its pixels dispersed evenly over the character area. For starters you can use:

  • char map[11]=" .,:;ox%#@"; // 10 characters + terminating zero

sorted in descending order of intensity (lightest character first) and treated as if linearly distributed.

So if the intensity of a pixel/area is i = <0-255>, then the replacement character will be

  • map[(255-i)*10/256];

If i==0 the pixel/area is black, if i==127 it is gray, and if i==255 it is white. You can experiment with different characters inside map[] ...
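
As a small worked example of that formula in standard C++ (the names ascii_map, intensity_to_char and dots_to_text are made up for this sketch):

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Character map from the text, lightest glyph first.
static const char ascii_map[] = " .,:;ox%#@";

// i in <0,255>: 0 is black (densest glyph), 255 is white (space).
char intensity_to_char(int i)
{
    assert(i >= 0 && i <= 255);
    return ascii_map[(255 - i) * 10 / 256];
}

// Convert a grid of dot intensities (cols values per text line) to text.
std::string dots_to_text(const std::vector<int>& dots, int cols)
{
    assert(cols > 0 && dots.size() % static_cast<std::size_t>(cols) == 0);
    std::string txt;
    for (std::size_t k = 0; k < dots.size(); k++)
    {
        txt += intensity_to_char(dots[k]);
        if ((k + 1) % static_cast<std::size_t>(cols) == 0) txt += '\n';
    }
    return txt;
}
```

Here intensity_to_char(0) gives '@', intensity_to_char(127) gives 'o' (index 128*10/256 = 5), and intensity_to_char(255) gives a space.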

Here is an ancient example of mine in C++ and VCL:

AnsiString m=" .,:;ox%#@";
Graphics::TBitmap *bmp=new Graphics::TBitmap;
bmp->LoadFromFile("pic.bmp");
bmp->HandleType=bmDIB;
bmp->PixelFormat=pf24bit;

int x,y,i,c,l;
BYTE *p;
AnsiString s,endl;
endl=char(13); endl+=char(10);
l=m.Length();
s="";
for (y=0;y<bmp->Height;y++)
    {
    p=(BYTE*)bmp->ScanLine[y];
    for (x=0;x<bmp->Width;x++)
        {
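        i =p[x+x+x+0];      // sum the 3 color channels of pixel x: 0..765
        i+=p[x+x+x+1];
        i+=p[x+x+x+2];
        i=(i*l)/768;        // scale to 0..l-1
        s+=m[l-i];          // AnsiString is indexed from 1: black -> m[l], white -> m[1]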
        }
    s+=endl;
    }
mm_log->Lines->Text=s;
mm_log->Lines->SaveToFile("pic.txt");
delete bmp;

You need to replace/ignore the VCL stuff unless you use the Borland/Embarcadero environment:

  • mm_log is the memo where the text is output
  • bmp is the input bitmap
  • AnsiString is a VCL string type indexed from 1, not from 0 as char* !!!
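
For readers without VCL, the same loop might look like this in standard C++. It is a sketch under assumptions: the function rgb24_to_ascii is a hypothetical name, and the input is assumed to be a tightly packed 24-bit buffer (3 bytes per pixel, no row padding), which real BMP scan lines do not always satisfy.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Portable counterpart of the VCL loop above. `pixels` is assumed to hold
// w*h pixels, 3 bytes each, row by row, with no padding between rows.
std::string rgb24_to_ascii(const std::vector<std::uint8_t>& pixels, int w, int h)
{
    const std::string m = " .,:;ox%#@";           // lightest to darkest
    const int l = static_cast<int>(m.size());
    assert(w > 0 && h > 0);
    assert(pixels.size() == static_cast<std::size_t>(w) * h * 3);
    std::string s;
    for (int y = 0; y < h; y++)
    {
        const std::uint8_t* p = &pixels[static_cast<std::size_t>(y) * w * 3];
        for (int x = 0; x < w; x++)
        {
            int i = p[3 * x + 0] + p[3 * x + 1] + p[3 * x + 2]; // 0..765
            i = (i * l) / 768;                                   // 0..l-1
            s += m[l - 1 - i];  // std::string is 0-based, so l-1-i replaces l-i
        }
        s += "\r\n";
    }
    return s;
}
```

The only real difference from the VCL version is the index: m[l-1-i] instead of m[l-i], because std::string starts at 0.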

This is the result: Slightly NSFW intensity example image

On the left is the ASCII art output (font size 5px), and on the right the input image zoomed a few times. As you can see, the output is larger (pixel -> character). If you use larger areas instead of pixels, the zoom is smaller, but of course the output is less visually pleasing. This approach is very easy and fast to code/process.

When you add more advanced things like:

  • automated map computations
  • automatic pixel/area size selection
  • aspect ratio corrections

Then you can process more complex images with better results:

Here is the result in 1:1 ratio (zoom to see the characters):

Of course, with area sampling you lose the small details. This is an image of the same size as the first example, sampled with areas:

Slightly NSFW intensity advanced example image

As you can see, this is more suited to bigger images.
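
One possible reading of the area size selection and aspect ratio correction is sketched below. It is an assumption, not the method used for the images above: the sampling area is given the same aspect ratio as one rendered character, and its width follows from a desired number of text columns. CellSize and area_for_columns are hypothetical names.

```cpp
#include <cassert>

struct CellSize { int w, h; };

// Pick the sampling-area size for a target number of text columns so that the
// area has the same aspect ratio as one rendered character (char_w x char_h px).
CellSize area_for_columns(int image_w, int columns, int char_w, int char_h)
{
    assert(image_w > 0 && columns > 0 && char_w > 0 && char_h > 0);
    CellSize a;
    a.w = image_w / columns;
    if (a.w < 1) a.w = 1;
    a.h = (a.w * char_h + char_w / 2) / char_w;   // rounded a.w*char_h/char_w
    if (a.h < 1) a.h = 1;
    return a;
}
```

For an 800 pixel wide image, 100 columns and an 8x16 pixel character this gives an 8x16 pixel area, so the text keeps the proportions of the image.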

Character fitting (hybrid between Shading and Solid ASCII Art)

This approach tries to replace an area (no more single-pixel dots) with a character of similar intensity and shape. This leads to better results than the previous approach, even with bigger fonts; on the other hand, it is a bit slower. There are several ways to do this, but the main idea is to compute the difference (distance) between the image area (dot) and the rendered character. You can start with a naive sum of absolute differences between pixels, but that does not give very good results because even a 1-pixel shift makes the distance big; instead you can use correlation or other metrics. The overall algorithm is almost the same as in the previous approach:

  1. evenly divide the image into (gray-scale) rectangular areas (dots)
    • ideally with the same aspect ratio as the rendered font characters (this preserves the aspect ratio; do not forget that characters usually overlap a bit along the x axis)
  2. compute the intensity of each area (dot)
  3. replace it with the character from the character map with the closest intensity/shape
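
To make the weakness of the naive pixel-wise distance concrete, here is a minimal sketch of a sum of absolute differences over two equally sized gray-scale patches (sad_distance is a hypothetical name):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Naive distance: sum of absolute differences between two equally sized
// 8-bit gray-scale patches stored in the same pixel order.
long sad_distance(const std::vector<std::uint8_t>& a,
                  const std::vector<std::uint8_t>& b)
{
    assert(a.size() == b.size());
    long d = 0;
    for (std::size_t k = 0; k < a.size(); k++)
        d += std::abs(static_cast<int>(a[k]) - static_cast<int>(b[k]));
    return d;
}
```

For a 3x3 patch with a bright vertical bar in the first column, the same bar shifted by one pixel scores 1530, while a completely empty patch scores only 765. So the shifted copy looks "further away" than no bar at all, which is why a more tolerant metric is preferable.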

How to compute the distance between a character and a dot? That is the hardest part of this approach. While experimenting I developed this compromise between speed, quality, and simplicity:

  1. Divide the character area into zones

    • compute a separate intensity for the left, right, up, down, and center zone of each character from your conversion alphabet (map)
    • normalize all intensities so they are independent of the area size: i=(i*256)/(xs*ys)
  2. Process the source image in rectangular areas

    • (with the same aspect ratio as the target font)
    • for each area compute the intensities in the same manner as in step 1
    • find the closest match among the intensities in the conversion alphabet
    • output the fitted character
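
The zone idea can be sketched without VCL roughly as follows. This is an illustration only: Zones, zone_intensity, zone_distance and closest_zone_match are hypothetical names, the image is assumed to be an 8-bit gray-scale buffer stored row by row, and the quarter-wide border zones are one possible choice of layout.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Zone intensities of one area: left, right, up, down, center.
struct Zones { int il = 0, ir = 0, iu = 0, id = 0, ic = 0; };

// gray: row-major 8-bit image of width w; (xx,yy) area position, (xs,ys) area size.
Zones zone_intensity(const std::vector<std::uint8_t>& gray, int w,
                     int xx, int yy, int xs, int ys)
{
    assert(xs > 0 && ys > 0 && xx >= 0 && yy >= 0 && xx + xs <= w);
    assert(static_cast<std::size_t>(yy + ys) * w <= gray.size());
    const int x0 = xs / 4, y0 = ys / 4;       // border zones are a quarter wide
    const int x1 = xs - x0, y1 = ys - y0;
    Zones z;
    for (int y = 0; y < ys; y++)
        for (int x = 0; x < xs; x++)
        {
            const int i = gray[static_cast<std::size_t>(yy + y) * w + (xx + x)];
            if (x <= x0) z.il += i;
            if (x >= x1) z.ir += i;
            if (y <= y0) z.iu += i;
            if (y >= y1) z.id += i;
            if (x >= x0 && x <= x1 && y >= y0 && y <= y1) z.ic += i;
        }
    const int n = xs * ys;                    // normalize: i=(i*256)/(xs*ys)
    z.il = z.il * 256 / n; z.ir = z.ir * 256 / n;
    z.iu = z.iu * 256 / n; z.id = z.id * 256 / n;
    z.ic = z.ic * 256 / n;
    return z;
}

// Sum of absolute zone differences; smaller means a closer match.
int zone_distance(const Zones& a, const Zones& b)
{
    return std::abs(a.il - b.il) + std::abs(a.ir - b.ir) + std::abs(a.iu - b.iu)
         + std::abs(a.id - b.id) + std::abs(a.ic - b.ic);
}

// Index of the alphabet entry closest to `dot` (linear search).
std::size_t closest_zone_match(const std::vector<Zones>& alphabet, const Zones& dot)
{
    assert(!alphabet.empty());
    std::size_t best = 0;
    for (std::size_t k = 1; k < alphabet.size(); k++)
        if (zone_distance(alphabet[k], dot) < zone_distance(alphabet[best], dot)) best = k;
    return best;
}
```

The alphabet entries would come from running zone_intensity on each rendered character; rendering the glyphs themselves depends on the graphics library and is left out here.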

This is the result for font size = 7px:

As you can see, the output is visually pleasing even with a bigger font size (the previous approach example used a 5px font size). The output is roughly the same size as the input image (no zoom). The better results are achieved because the characters are closer to the original image not only in intensity but also in overall shape, so you can use larger fonts and still preserve details (up to a point, of course).

Here is the complete code for the VCL-based conversion app:

//---------------------------------------------------------------------------
#include <vcl.h>
#pragma hdrstop

#include "win_main.h"
//---------------------------------------------------------------------------
#pragma package(smart_init)
#pragma resource "*.dfm"
TForm1 *Form1;
Graphics::TBitmap *bmp=new Graphics::TBitmap;
//---------------------------------------------------------------------------
class intensity
    {
public:
    char c;                 // character
    int il,ir,iu,id,ic;     // intensity of part: left,right,up,down,center
    intensity() { c=0; reset(); }
    void reset() { il=0; ir=0; iu=0; id=0; ic=0; }
    void compute(DWORD **p,int xs,int ys,int xx,int yy) // p source image, (xs,ys) area size, (xx,yy) area position
        {
        int x0=xs>>2,y0=ys>>2;
        int x1=xs-x0,y1=ys-y0;
        int x,y,i;
        reset();
        for (y=0;y<ys;y++)
         for (x=0;x<xs;x++)
            {
            i=(p[yy+y][xx+x]&255);
            if (x<=x0) il+=i;
            if (x>=x1) ir+=i;
            if (y<=y0) iu+=i;   // vertical zones must use the y thresholds
            if (y>=y1) id+=i;
            if ((x>=x0)&&(x<=x1)
              &&(y>=y0)&&(y<=y1)) ic+=i;
            }
        // normalize
        i=xs*ys;
        il=(il<<8)/i;
        ir=(ir<<8)/i;
        iu=(iu<<8)/i;
        id=(id<<8)/i;
        ic=(ic<<8)/i;
        }
    };
//---------------------------------------------------------------------------
AnsiString bmp2txt_big(Graphics::TBitmap *bmp,TFont *font) // character sized areas
    {
    int i,i0,d,d0;
    int xs,ys,xf,yf,x,xx,y,yy;
    DWORD **p=NULL,**q=NULL;    // bitmap direct pixel access
    Graphics::TBitmap *tmp;     // temp bitmap for single character
    AnsiString txt="";          // output ASCII art text
    AnsiString eol="\r\n";      // end of line sequence
    intensity map[97];          // character map
    intensity gfx;

    // input image size
    xs=bmp->Width;
    ys=bmp->Height;
    // output font size
    xf=font->Size;   if (xf<0) xf=-xf;
    yf=font->Height; if (yf<0) yf=-yf;
    for (;;) // loop to simplify the dynamic allocation error handling
        {
        // allocate and init buffers
        tmp=new Graphics::TBitmap; if (tmp==NULL) break;
            // allow 32bit pixel access as DWORD/int pointer
            tmp->HandleType=bmDIB;    bmp->HandleType=bmDIB;
            tmp->PixelFormat=pf32bit; bmp->PixelFormat=pf32bit;
            // copy target font properties to tmp
            tmp->Canvas->Font->Assign(font);
            tmp->SetSize(xf,yf);
            tmp->Canvas->Font ->Color=clBlack;
            tmp->Canvas->Pen  ->Color=clWhite;
            tmp->Canvas->Brush->Color=clWhite;
            xf=tmp->Width;
            yf=tmp->Height;
        // direct pixel access to bitmaps
        p  =new DWORD*[ys];        if (p  ==NULL) break; for (y=0;y<ys;y++) p[y]=(DWORD*)bmp->ScanLine[y];
        q  =new DWORD*[yf];        if (q  ==NULL) break; for (y=0;y<yf;y++) q[y]=(DWORD*)tmp->ScanLine[y];
        // create character map
        for (x=0,d=32;d<128;d++,x++)
            {
            map[x].c=char(DWORD(d));
            // clear tmp
            tmp->Canvas->FillRect(TRect(0,0,xf,yf));
            // render tested character to tmp
            tmp->Canvas->TextOutA(0,0,map[x].c);
            // compute intensity
            map[x].compute(q,xf,yf,0,0);
            } map[x].c=0;
        // loop through image by zoomed character size step
        xf-=xf/3; // characters usually overlap by about 1/3
        xs-=xs%xf;
        ys-=ys%yf;
        for (y=0;y<ys;y+=yf,txt+=eol)
         for (x=0;x<xs;x+=xf)
            {
            // compute intensity
            gfx.compute(p,xf,yf,x,y);
            // find closest match in map[]
            i0=0; d0=-1;
            for (i=0;map[i].c;i++)
                {
                d=abs(map[i].il-gfx.il)
                 +abs(map[i].ir-gfx.ir)
                 +abs(map[i].iu-gfx.iu)
                 +abs(map[i].id-gfx.id)
                 +abs(map[i].ic-gfx.ic);
                if ((d0<0)||(d0>d)) { d0=d; i0=i; }
                }
            // add fitted character to output
            txt+=map[i0].c;
            }
        break;
        }
    // free buffers
    if (tmp) delete tmp;
    if (p  ) delete[] p;
    if (q  ) delete[] q;
    return txt;
    }
//---------------------------------------------------------------------------
AnsiString bmp2txt_small(Graphics::TBitmap *bmp)    // pixel sized areas
    {
    AnsiString m=" `'.,:;i+o*%&$#@"; // constant character map
    int x,y,i,c,l;
    BYTE *p;
    AnsiString txt="",eol="\r\n";
    l=m.Length();
    bmp->HandleType=bmDIB;
    bmp->PixelFormat=pf32bit;
    for (y=0;y<bmp->Height;y++)
        {
        p=(BYTE*)bmp->ScanLine[y];
        for (x=0;x<bmp->Width;x++)
            {
            i =p[(x<<2)+0];
            i+=p[(x<<2)+1];
            i+=p[(x<<2)+2];
            i=(i*l)/768;
            txt+=m[l-i];
            }
        txt+=eol;
        }
    return txt;
    }
//---------------------------------------------------------------------------
void update()
    {
    int x0,x1,y0,y1,i,l;
    x0=bmp->Width;
    y0=bmp->Height;
    if ((x0<64)||(y0<64)) Form1->mm_txt->Text=bmp2txt_small(bmp);
     else                 Form1->mm_txt->Text=bmp2txt_big  (bmp,Form1->mm_txt->Font);
    Form1->mm_txt->Lines->SaveToFile("pic.txt");
    for (x1=0,i=1,l=Form1->mm_txt->Text.Length();i<=l;i++) if (Form1->mm_txt->Text[i]==13) { x1=i-1; break; }
    for (y1=0,i=1,l=Form1->mm_txt->Text.Length();i<=l;i++) if (Form1->mm_txt->Text[i]==13) y1++;
    x1*=abs(Form1->mm_txt->Font->Size);
    y1*=abs(Form1->mm_txt->Font->Height);
    if (y0<y1) y0=y1; x0+=x1+48;
    Form1->ClientWidth=x0;
    Form1->ClientHeight=y0;
    Form1->Caption=AnsiString().sprintf("Picture -> Text ( Font %ix%i )",abs(Form1->mm_txt->Font->Size),abs(Form1->mm_txt->Font->Height));
    }
//---------------------------------------------------------------------------
void draw()
    {
    Form1->ptb_gfx->Canvas->Draw(0,0,bmp);
    }
//---------------------------------------------------------------------------
void load(AnsiString name)
    {
    bmp->LoadFromFile(name);
    bmp->HandleType=bmDIB;
    bmp->PixelFormat=pf32bit;
    Form1->ptb_gfx->Width=bmp->Width;
    Form1->ClientHeight=bmp->Height;
    Form1->ClientWidth=(bmp->Width<<1)+32;
    }
//---------------------------------------------------------------------------
__fastcall TForm1::TForm1(TComponent* Owner):TForm(Owner)
    {
    load("pic.bmp");
    update();
    }
//---------------------------------------------------------------------------
void __fastcall TForm1::FormDestroy(TObject *Sender)
    {
    delete bmp;
    }
//---------------------------------------------------------------------------
void __fastcall TForm1::FormPaint(TObject *Sender)
    {
    draw();
    }
//---------------------------------------------------------------------------
void __fastcall TForm1::FormMouseWheel(TObject *Sender, TShiftState Shift,int WheelDelta, TPoint &MousePos, bool &Handled)
    {
    int s=abs(mm_txt->Font->Size);
    if (WheelDelta<0) s--;
    if (WheelDelta>0) s++;
    mm_txt->Font->Size=s;
    update();
    }
//---------------------------------------------------------------------------

It is a simple form app (Form1) with a single TMemo mm_txt in it. It loads the image "pic.bmp", then, according to its resolution, chooses which approach to use to convert it to text, which is saved to "pic.txt" and sent to the memo for display. For those without VCL: ignore the VCL stuff and replace AnsiString with any string type you have, and Graphics::TBitmap with any bitmap or image class you have at your disposal that offers pixel access.

A very important note: this uses the settings of mm_txt->Font, so make sure you set:

  • Font->Pitch=fpFixed
  • Font->Charset=OEM_CHARSET
  • Font->Name="System"

to make this work properly; otherwise the font will not be handled as mono-spaced. The mouse wheel just changes the font size up/down to see the results at different font sizes.

[Notes]

  • see Word Portraits visualization
  • use a language with bitmap/file access and text output capabilities
  • I strongly recommend starting with the first approach, as it is very easy, straightforward, and simple, and only then moving to the second (which can be done as a modification of the first, so most of the code stays as is anyway)
  • It is a good idea to compute with inverted intensity (black pixels have the max value) because the standard text preview is on a white background, which leads to much better results.
  • you can experiment with the size, count, and layout of the subdivision zones, or use a grid such as 3x3 instead.
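
As an illustration of the last two notes, a 3x3 grid with inverted intensity could be computed along these lines. This is a sketch with a hypothetical function name (grid3x3_inverted), assuming an 8-bit gray-scale buffer stored row by row and areas of at least 3x3 pixels.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Mean inverted intensity (black = 255, white = 0) of each cell of a 3x3 grid
// laid over one area. Cells are returned row by row: index = row*3 + column.
std::array<int, 9> grid3x3_inverted(const std::vector<std::uint8_t>& gray, int w,
                                    int xx, int yy, int xs, int ys)
{
    assert(xs >= 3 && ys >= 3 && xx >= 0 && yy >= 0 && xx + xs <= w);
    assert(static_cast<std::size_t>(yy + ys) * w <= gray.size());
    std::array<long, 9> sum{};
    std::array<int, 9> cnt{};
    for (int y = 0; y < ys; y++)
        for (int x = 0; x < xs; x++)
        {
            const int cell = (y * 3 / ys) * 3 + (x * 3 / xs);
            sum[cell] += 255 - gray[static_cast<std::size_t>(yy + y) * w + (xx + x)];
            cnt[cell]++;
        }
    std::array<int, 9> out{};
    for (int k = 0; k < 9; k++) out[k] = static_cast<int>(sum[k] / cnt[k]);
    return out;
}
```

The nine values can then be compared between a rendered character and an image area with the same kind of absolute-difference sum used for the five zones.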

[Edit1] comparison

Finally, here is a comparison between the two approaches on the same input:

The images marked with a green dot are done with approach #2 and the red ones with #1, all at 6 pixel font size. As you can see on the light bulb image, the shape-sensitive approach is much better (even though #1 was done on a 2x zoomed source image).

[Edit2] cool app

While reading today's new questions I got an idea for a cool app that grabs a selected region of the desktop, continuously feeds it to the ASCII art convertor, and shows the result. After an hour of coding it was done, and I am so satisfied with the result that I simply have to add it here.

OK, the app consists of just 2 windows. The first, master window is basically my old convertor window without the image selection and preview (all the stuff above is in it). It has just the ASCII preview and conversion settings. The second window is an empty form with a transparent inside for selecting the grabbing area (no functionality whatsoever).

Now, on a timer, I just grab the area selected by the selection form, pass it to the conversion, and preview the ASCII art.

So you enclose the area you want to convert with the selection window and view the result in the master window. It can be a game, a viewer, ... It looks like this:

So now I can even watch videos in ASCII art for fun. Some are really nice :).

[Edit3]

If you want to try to implement this in GLSL take a look at this:
