Writing Arabic with PDFBox in the correct presentation forms, without the characters being separated

挽巷 2020-11-29 10:14

I'm trying to generate a PDF that contains Arabic text using Apache PDFBox, but the text is rendered as separated characters because Apache parses given Arabic string to a

2 Answers
  • 2020-11-29 10:54

    Notice:

    The sample code in this answer might be outdated. Please refer to h q's answer for working sample code.


    First, I would like to thank Tilman Hausherr and M.Prokhorov for showing me the library that makes writing Arabic possible with Apache PDFBox.

    This answer is divided into two sections:

    1. Downloading the library and installing it
    2. How to use the library

    Downloading the library and installing it

    We are going to use the ICU library.
    ICU stands for International Components for Unicode and it is a mature, widely used set of C/C++ and Java libraries providing Unicode and Globalization support for software applications. ICU is widely portable and gives applications the same results on all platforms and between C/C++ and Java software.

    To download the library, go to the ICU downloads page.
    Choose the latest version of ICU4J.

    You will be taken to another page, where you will find a box with direct links to the needed components. Download the following three files:

    1. icu4j-docs.jar
    2. icu4j-src.jar
    3. icu4j.jar

    The following steps explain how to create and add the library in the NetBeans IDE:

    1. Navigate to the toolbar and click Tools
    2. Choose Libraries
    3. At the bottom left, click the New Library button and create your library
    4. Select the library that you created in the libraries list
    5. Add the jar files to it as follows
    6. Add icu4j.jar under Classpath
    7. Add icu4j-src.jar under Sources
    8. Add icu4j-docs.jar under Javadoc
    9. View your open projects at the far right
    10. Expand the project in which you want to use the library
    11. Right-click the Libraries folder and choose Add Library
    12. Finally, choose the library that you have just created.

    Now you are ready to use the library. Just import what you need, like this:

    import com.ibm.icu.What_You_Want_To_Import;
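
    To confirm that the jar really ended up on the classpath before writing any PDF code, a quick check along the following lines may help. This is only a sketch: `ClasspathCheck` and `isOnClasspath` are hypothetical names introduced here, not part of ICU4J or PDFBox. The only real name it relies on is `com.ibm.icu.text.ArabicShaping`, the class used later in this answer.

```java
// Hypothetical helper (not part of ICU4J or PDFBox): reports whether a class can be loaded.
class ClasspathCheck {
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // The class (or something it depends on) is missing from the classpath.
            return false;
        }
    }

    public static void main(String[] args) {
        // ArabicShaping is the ICU4J class that the rest of this answer depends on.
        System.out.println("ICU4J available: "
                + isOnClasspath("com.ibm.icu.text.ArabicShaping"));
    }
}
```

    If this prints `false`, the library was most likely not added to the project in the steps above.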
    


    How to use the library

    By using the ArabicShaping class and then reversing the string, we can write a correctly joined LINE of Arabic text.
    Here is the code. Note the comments in it.

    import com.ibm.icu.text.ArabicShaping;
    import com.ibm.icu.text.ArabicShapingException;
    import java.io.File;
    import java.io.IOException;
    import org.apache.pdfbox.pdmodel.PDDocument;
    import org.apache.pdfbox.pdmodel.PDPage;
    import org.apache.pdfbox.pdmodel.PDPageContentStream;
    import org.apache.pdfbox.pdmodel.font.*;
    
    public class Main {
        public static void main(String[] args) throws IOException , ArabicShapingException
    {
            File f = new File("Arabic Font File of format.ttf");
            PDDocument doc = new PDDocument();
            PDPage Page = new PDPage();
            doc.addPage(Page);
            PDPageContentStream Writer = new PDPageContentStream(doc, Page);
            Writer.beginText();
            Writer.setFont(PDType0Font.load(doc, f), 20);
            Writer.newLineAtOffset(0, 700);
            // The trick is in the next lines of code, but here are a few notes first:
            // PDFBox writes glyphs from left to right, but Arabic is a right-to-left language,
            // so the shaped string has to be reversed before it is written.
            // The output is correct, except that every line is left-justified (this is not hard to resolve).
            // Arabic text therefore has to be written to the PDF line by line, like this:
            String s ="جملة بالعربي لتجربة الكلاس اللذي يساعد علي وصل الحروف بشكل صحيح";
            // shape() joins the letters into their presentation forms; it can throw ArabicShapingException.
            String shaped = new ArabicShaping(ArabicShaping.LETTERS_SHAPE).shape(s);
            // If the text contains numbers, pass the reversed string through reverseNumbersInString (see the update below).
            Writer.showText(new StringBuilder(shaped).reverse().toString());
            Writer.endText();
            Writer.close();
            doc.save(new File("File_Test.pdf"));
            doc.close();
        }
    }
    

    Here is the output
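
    The code comments above mention that every line comes out left-justified. One possible way to right-align a line is to compute its x offset from the page width and the rendered text width. The sketch below shows only the arithmetic; `RightAlign`, `rightAlignedX`, and the `margin` parameter are hypothetical names introduced here. It assumes the width is supplied in 1/1000 text-space units, which is what PDFBox's `PDFont.getStringWidth` returns.

```java
// Hypothetical helper: computes the x offset at which a line must start
// so that it ends `margin` points before the right edge of the page.
class RightAlign {
    /**
     * @param pageWidth  page width in points (e.g. from the page's media box)
     * @param margin     right margin in points (an assumption of this sketch)
     * @param widthUnits string width in 1/1000 text-space units
     * @param fontSize   font size in points
     */
    static float rightAlignedX(float pageWidth, float margin, float widthUnits, float fontSize) {
        // Convert font units to points, then measure back from the right edge.
        float textWidth = widthUnits / 1000f * fontSize;
        return pageWidth - margin - textWidth;
    }
}
```

    In the sample above, the result would replace the `0` in `Writer.newLineAtOffset(0, 700)`, with `pageWidth` taken from `Page.getMediaBox().getWidth()` and `widthUnits` from calling `getStringWidth` on the loaded font with the already shaped and reversed string.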

    I hope that I have covered everything.

    Update: After reversing the line, make sure to reverse the numbers back so that they keep their original digit order.
    Here are a couple of functions that can help:

    public static boolean isInt(String Input)
    {
        try{Integer.parseInt(Input);return true;}
        catch(NumberFormatException e){return false;}
    }
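    // Reverses every run of digits (including '.' and '-' inside the run), so that
    // numbers read correctly again after the whole line has been reversed.
    public static String reverseNumbersInString(String Input)
    {
        char[] Separated = Input.toCharArray();
        StringBuilder Result = new StringBuilder();
        StringBuilder Hold = new StringBuilder();
        int i = 0;
        while(i < Separated.length)
        {
            if(isInt(Separated[i]+""))
            {
                // Collect the whole number run; i ends on the first character after it.
                while(i < Separated.length && (isInt(Separated[i]+"") ||  Separated[i] == '.' ||  Separated[i] == '-'))
                {
                    Hold.append(Separated[i]);
                    i++;
                }
                Result.append(Hold.reverse());
                Hold.setLength(0);
            }
            else{Result.append(Separated[i]);i++;}
        }
        return Result.toString();
    }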
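
    To see why the numbers need this second pass, here is a small self-contained sketch. It deliberately uses a different technique from the helper above (a regular expression instead of a character loop), and `DigitRuns` and `restoreNumbers` are hypothetical names introduced only for this illustration. It assumes, like the helper, that a number run starts with a digit and may continue with digits, `.` or `-`.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration: restores digit order in a line that was reversed as a whole.
class DigitRuns {
    // A run starts with a decimal digit and may continue with digits, '.' or '-'.
    private static final Pattern NUMBER_RUN = Pattern.compile("\\p{Nd}[\\p{Nd}.\\-]*");

    static String restoreNumbers(String reversedLine) {
        Matcher m = NUMBER_RUN.matcher(reversedLine);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // Reverse only the matched run; everything else stays reversed.
            String fixed = new StringBuilder(m.group()).reverse().toString();
            m.appendReplacement(out, Matcher.quoteReplacement(fixed));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String line = "Total 123 items";
        String reversed = new StringBuilder(line).reverse().toString();
        System.out.println(reversed);                 // the digits are now in the wrong order
        System.out.println(restoreNumbers(reversed)); // the digits are back in reading order
    }
}
```

    Reversing the whole line turns `123` into `321`; the second pass flips only the number runs back, which is the same idea as `reverseNumbersInString`.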
    
  • 2020-11-29 11:09

    Here is code that works. Download a sample font, e.g. trado.ttf.

    Make sure the pdfbox-app and icu4j jar files are in your classpath.

    import java.io.File;
    import java.io.IOException;
    
    import com.ibm.icu.text.ArabicShaping;
    import com.ibm.icu.text.ArabicShapingException;
    import com.ibm.icu.text.Bidi;
    
    import org.apache.pdfbox.pdmodel.PDDocument;
    import org.apache.pdfbox.pdmodel.PDPage;
    import org.apache.pdfbox.pdmodel.PDPageContentStream;
    import org.apache.pdfbox.pdmodel.font.*;
    
    public class Main {
        public static void main(String[] args) throws IOException , ArabicShapingException
        {
            File f = new File("trado.ttf");
            PDDocument doc = new PDDocument();
            PDPage Page = new PDPage();
            doc.addPage(Page);
            PDPageContentStream Writer = new PDPageContentStream(doc, Page);
            Writer.beginText();
            Writer.setFont(PDType0Font.load(doc, f), 20);
            Writer.newLineAtOffset(0, 700);
            String s ="جملة بالعربي لتجربة الكلاس اللذي يساعد علي وصل الحروف بشكل صحيح";
            Writer.showText(bidiReorder(s));
            Writer.endText();
            Writer.close();
            doc.save(new File("File_Test.pdf"));
            doc.close();
        }
    
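        private static String bidiReorder(String text)
        {
            try {
                // Shape the letters first, then let Bidi convert logical order to visual order.
                Bidi bidi = new Bidi((new ArabicShaping(ArabicShaping.LETTERS_SHAPE)).shape(text), Bidi.DIRECTION_DEFAULT_RIGHT_TO_LEFT);
                bidi.setReorderingMode(Bidi.REORDER_DEFAULT);
                return bidi.writeReordered(Bidi.DO_MIRRORING);
            }
            catch (ArabicShapingException ase3) {
                // Fall back to the unshaped text if shaping fails.
                return text;
            }
        }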
    
    }
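
    The reason a bidi pass handles mixed text better than a blind reversal can be seen by looking at the embedding levels the algorithm assigns. The sketch below is only an illustration and makes one substitution: it uses the JDK's built-in `java.text.Bidi` instead of ICU4J's `com.ibm.icu.text.Bidi`, so that it runs without extra jars. Both implement the Unicode Bidirectional Algorithm. `BidiLevels` and `levels` are hypothetical names.

```java
import java.text.Bidi;

// Hypothetical illustration using the JDK's java.text.Bidi (not ICU4J).
class BidiLevels {
    // Returns the embedding level of every character in the logical-order text.
    // Odd levels are laid out right-to-left, even levels left-to-right.
    static int[] levels(String logicalText) {
        Bidi bidi = new Bidi(logicalText, Bidi.DIRECTION_DEFAULT_RIGHT_TO_LEFT);
        int[] result = new int[logicalText.length()];
        for (int i = 0; i < result.length; i++) {
            result[i] = bidi.getLevelAt(i);
        }
        return result;
    }

    public static void main(String[] args) {
        // Three Arabic letters, a space, then the number 123.
        String text = "\u0633\u0639\u0631 123";
        int[] levels = levels(text);
        for (int i = 0; i < levels.length; i++) {
            String direction = (levels[i] % 2 == 1) ? "RTL" : "LTR";
            System.out.println("index " + i + ": level " + levels[i] + " (" + direction + ")");
        }
    }
}
```

    The Arabic letters receive an odd level and the digits an even one, so a bidi-aware reorder keeps the digits in reading order while the letters are flipped. That is why this answer does not need a separate number-reversing helper.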
    