How can I retrieve images from a .pptx file using MS Open XML SDK?

Posted by 谁都会走 on 2019-12-30 09:32:27

Question


I started experimenting with Open XML SDK 2.0 for Microsoft Office.

I can already do a few things, such as retrieving all the text on each slide and getting the size of the presentation. For example, I do the latter this way:

using (var doc = PresentationDocument.Open(pptx_filename, false)) {
     var presentation = doc.PresentationPart.Presentation;

     Debug.Print("width: " + (presentation.SlideSize.Cx / 9525.0).ToString());
     Debug.Print("height: " + (presentation.SlideSize.Cy / 9525.0).ToString());
}

Now I'd like to retrieve embedded images in a given slide. Does anyone know how to do this or can point me to some docs on the subject?


Answer 1:


First, get the SlidePart that contains the images you want:

// Requires: using System; using DocumentFormat.OpenXml.Packaging; using DocumentFormat.OpenXml.Presentation;
public static SlidePart GetSlidePart(PresentationDocument presentationDocument, int slideIndex)
{
    if (presentationDocument == null)
    {
        throw new ArgumentNullException("presentationDocument", "GetSlidePart Method: parameter presentationDocument is null");
    }

    // Get the number of slides in the presentation.
    // CountSlides is a helper that is not shown in this answer; it is assumed to return the slide count.
    int slidesCount = CountSlides(presentationDocument);

    if (slideIndex < 0 || slideIndex >= slidesCount)
    {
        throw new ArgumentOutOfRangeException("slideIndex", "GetSlidePart Method: parameter slideIndex is out of range");
    }

    PresentationPart presentationPart = presentationDocument.PresentationPart;

    // Verify that the presentation part and presentation exist.
    if (presentationPart != null && presentationPart.Presentation != null)
    {
        Presentation presentation = presentationPart.Presentation;

        if (presentation.SlideIdList != null)
        {
            // Get the collection of slide IDs from the slide ID list.
            var slideIds = presentation.SlideIdList.ChildElements;

            if (slideIndex < slideIds.Count)
            {
                // Get the relationship ID of the slide.
                string slidePartRelationshipId = ((SlideId)slideIds[slideIndex]).RelationshipId;

                // Get the specified slide part from the relationship ID.
                SlidePart slidePart = (SlidePart)presentationPart.GetPartById(slidePartRelationshipId);

                return slidePart;
            }
        }
    }

    // No slide found
    return null;
}

Then find the Picture element for the image you are looking for, matching on the image's file name:

Picture imageToRemove = slidePart.Slide.Descendants<Picture>().SingleOrDefault(picture => picture.NonVisualPictureProperties.OuterXml.Contains(imageFileName));
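
The lookup above returns the Picture element, not the image data itself. As a rough sketch of the remaining step, the code below follows each picture's embed relationship to its ImagePart and copies the stream to disk. The class name SlideImages, the method name SaveSlideImages and the outputDirectory parameter are made up for illustration, and the null-conditional operators assume C# 6 or later, so treat this as one possible approach rather than the answerer's own code.

```csharp
using System.Collections.Generic;
using System.IO;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Presentation;

public static class SlideImages
{
    // Hypothetical helper: saves every embedded picture on the slide and
    // returns the paths of the files that were written.
    public static List<string> SaveSlideImages(SlidePart slidePart, string outputDirectory)
    {
        Directory.CreateDirectory(outputDirectory);
        var saved = new List<string>();

        foreach (Picture picture in slidePart.Slide.Descendants<Picture>())
        {
            // The blip's Embed attribute holds the relationship ID of the image part.
            string relationshipId = picture.BlipFill?.Blip?.Embed?.Value;
            if (string.IsNullOrEmpty(relationshipId))
            {
                continue; // linked (not embedded) picture, or no blip at all
            }

            // Resolve the relationship ID against the slide part.
            var imagePart = slidePart.GetPartById(relationshipId) as ImagePart;
            if (imagePart == null)
            {
                continue;
            }

            // Reuse the file name of the part inside the package (e.g. image1.png).
            string fileName = Path.GetFileName(imagePart.Uri.OriginalString);
            string target = Path.Combine(outputDirectory, fileName);

            using (Stream source = imagePart.GetStream())
            using (FileStream destination = File.Create(target))
            {
                source.CopyTo(destination);
            }

            saved.Add(target);
        }

        return saved;
    }
}
```

With the SlidePart returned by GetSlidePart, a call would look like SlideImages.SaveSlideImages(slidePart, "extracted"). If you only need every image on the slide and do not care which Picture uses it, enumerating slidePart.ImageParts directly is shorter.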



Answer 2:


The simplest way to get images out of Open XML formats:

Use any zip archive library to extract the images from the media folder of the .pptx file (ppt/media inside the archive); it contains the images used in the document. Alternatively, rename the file from .pptx to .zip and extract it manually to get the images from the media folder.
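
A minimal sketch of the zip approach using only System.IO.Compression follows. The names PptxMedia and ExtractMedia are invented for this example, the ppt/media prefix is the location PowerPoint normally uses, and on .NET Framework the project is assumed to reference System.IO.Compression.FileSystem.

```csharp
using System;
using System.IO;
using System.IO.Compression;

public static class PptxMedia
{
    // Hypothetical helper: copies every file under ppt/media/ into
    // outputDirectory and returns the number of files written.
    public static int ExtractMedia(string pptxPath, string outputDirectory)
    {
        Directory.CreateDirectory(outputDirectory);
        int count = 0;

        using (ZipArchive archive = ZipFile.OpenRead(pptxPath))
        {
            foreach (ZipArchiveEntry entry in archive.Entries)
            {
                bool inMediaFolder = entry.FullName.StartsWith("ppt/media/", StringComparison.OrdinalIgnoreCase);

                // Skip entries outside the media folder and directory entries (empty Name).
                if (!inMediaFolder || entry.Name.Length == 0)
                {
                    continue;
                }

                // entry.Name is the file name only, so nothing is written outside outputDirectory.
                string target = Path.Combine(outputDirectory, entry.Name);
                entry.ExtractToFile(target, true);
                count++;
            }
        }

        return count;
    }
}
```

For example, PptxMedia.ExtractMedia("deck.pptx", "extracted") would write the media files into an "extracted" folder. Note that this gives you the media of the whole presentation (which can include audio or video), not the images of one particular slide.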

Hope this helps.



Source: https://stackoverflow.com/questions/7070074/how-can-i-retrieve-images-from-a-pptx-file-using-ms-open-xml-sdk
