Tracking Code Into a PDF or PostScript File

Submitted by 亡梦爱人 on 2019-12-01 18:08:41

Question


Is there a way to track when a PDF is opened? Perhaps by embedding some script into the pdf itself?

I saw the question below, and I suppose the answer is "no" for JavaScript, but I am wondering if this is possible at all.

Google analytics tracking code insert in pdf file


Answer 1:


The PDF standard includes support for JavaScript, but as @Wes Hardaker pointed out, not every PDF reader supports it. Still, sometimes some coverage is better than none.

Here's Adobe's official Acrobat JavaScript Scripting Guide. What's probably most interesting to you is the doc object which has a method called getURL(). To use it you'd just call:

app.doc.getURL('http://www.google.com/');

Bind that call to the document's open event and you've got a tracker. I'm not too familiar with creating events from within Adobe Acrobat, but from code it's pretty easy. The code below is a full working VS2010 C# WinForms app that uses the open source library iTextSharp (5.1.1.0). It creates a PDF and adds the JavaScript to the open event.

Some notes:

First, Adobe Acrobat and Reader will both warn the user whenever a document accesses an external resource. Most other PDF readers will probably do the same. This is very annoying, so for this reason alone it shouldn't be done. Personally I don't care if someone tracks my document opens, I just don't want to get a prompt every time.

Second, just to reiterate, this code works for Adobe Acrobat and Adobe Reader, probably as far back as at least V6, but may or may not work in other PDF readers.

Third, there's no safe way to uniquely identify the user. Doing so would require you to create and store some equivalent of a "cookie", which would mean writing to the user's file system, and that would be considered unsafe. This means that you could only detect the number of opens, not unique opens.

Fourth, this might not be legal everywhere. Some jurisdictions require that you notify users if you are tracking them and provide a way for them to see what information you are collecting.

But with all of the above, I can't not give an answer just because I don't like it.

using System;
using System.Text;
using System.Windows.Forms;
using System.IO;
using iTextSharp.text;
using iTextSharp.text.pdf;

namespace WindowsFormsApplication1
{
    public partial class Form1 : Form
    {
        public Form1()
        {
            InitializeComponent();
        }

        private void Form1_Load(object sender, EventArgs e)
        {
            //File that we will create
            string OutputFile = Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Desktop), "Events.pdf");

            //Standard PDF creation setup
            using (FileStream fs = new FileStream(OutputFile, FileMode.Create, FileAccess.Write, FileShare.None))
            {
                using (Document doc = new Document(PageSize.LETTER))
                {
                    using (PdfWriter writer = PdfWriter.GetInstance(doc, fs))
                    {
                        //Open our document for writing
                        doc.Open();

                        //Create an action that points to the built-in app.doc object and calls the getURL method on it
                        PdfAction act = PdfAction.JavaScript("app.doc.getURL('http://www.google.com/');", writer);

                        //Set that action as the document's open action
                        writer.SetOpenAction(act);

                        //We need to add some content to this PDF for it to be valid
                        doc.Add(new Paragraph("Hello"));

                        //Close the document
                        doc.Close();
                    }
                }
            }

            this.Close();
        }
    }
}
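
The script string passed to PdfAction.JavaScript in the code above is a fixed literal. As a rough sketch only, the string could instead be generated per document so the server can tell which file was opened. The helper name buildOpenScript, the example.com endpoint, and the "doc" query parameter are all assumptions made up for illustration; only the app.doc.getURL call comes from the answer itself. The try/catch is meant to keep a script error from surfacing in a viewer where that call is unavailable.

```javascript
// Hypothetical helper: builds the JavaScript source for a PDF open action.
// The endpoint and the "doc" query parameter are assumptions, not a real API.
function buildOpenScript(trackingBase, docId) {
  const url = trackingBase + "?doc=" + encodeURIComponent(docId);
  // JSON.stringify yields a quoted, escaped string literal, so characters
  // in the URL cannot break out of the generated script.
  const literal = JSON.stringify(url);
  // Wrap the call so that a failure is swallowed instead of reported.
  return "try { app.doc.getURL(" + literal + "); } catch (e) {}";
}

const script = buildOpenScript("http://example.com/track", "report 42");
console.log(script);
// prints: try { app.doc.getURL("http://example.com/track?doc=report%2042"); } catch (e) {}
```

The resulting string would then be handed to PdfAction.JavaScript in place of the hard-coded one. This identifies the document, not the person opening it, so the caveat about counting opens rather than unique opens still applies.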



Answer 2:


The problem with technologies like that is that they can never be absolute.

First, it's a security violation to trigger an external event and the software writers likely wouldn't support it (or, at least I hope not).

Second, it's dependent on things like the network. What happens when someone downloads the file and then reads it while offline on a plane, for example? You won't get the notification.

Third, there are multiple ways to read PDF files. Some people read them with readers you've likely not heard of (my favorite is a Linux application that I like much better than Adobe's AcroRead).

So the real answer is "no" (and I'd argue you shouldn't do it anyway, but that's not answering your question). Even if the software supported it, it still wouldn't be reliable in the first place.




Answer 3:


Given that PostScript is a fully capable programming language, there shouldn't be any reason it couldn't track when it is viewed or run.

I should think the difficult part in that would be finding the libraries (or making the functions yourself) to do the networking portion of the logging.

One quick note, however: with functionality like this it is probably best to keep the document accessible when the logging fails. People tend to get upset when their media suddenly becomes unavailable, which is exactly what would happen if you forced termination on failure. (Can you guarantee that your logging domain will never change? That it will always be available? What happens when the internet is not available to the user?)
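
The "still accessible on failure" idea can be sketched independently of PostScript or PDF. The snippet below is only an illustration in plain JavaScript: viewWithLogging, logOpen, and render are hypothetical names, not part of any viewer API. The point is that the logging attempt is best-effort and the content is produced either way.

```javascript
// Hypothetical wrapper: logOpen and render are placeholder callbacks,
// not part of any PDF or PostScript API.
function viewWithLogging(logOpen, render) {
  let logged = false;
  try {
    logOpen(); // may throw: offline, logging domain moved, blocked by viewer
    logged = true;
  } catch (err) {
    // Swallow the failure: tracking is best-effort only.
  }
  // The content is rendered whether or not the log call succeeded.
  return { logged: logged, content: render() };
}

// Simulate a reader on a plane with no network access.
const result = viewWithLogging(
  () => { throw new Error("network unreachable"); },
  () => "page 1"
);
console.log(result.logged, result.content);
// prints: false page 1
```

The opposite design, refusing to render when logOpen throws, is the "forced termination on failure" case described above.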



Source: https://stackoverflow.com/questions/8099927/tracking-code-into-a-pdf-or-postscript-file
