What does it mean for a PDF file to be certified as authentic? The short answer is that it has been digitally signed by the authority that created it. I have written several previous blogs about digital signatures. PDF supports two kinds of digital signatures: approval signatures and certification signatures. Any number of approval signatures may be applied to a PDF document, but only one certification signature may be applied, and it must be the first digital signature. Approval signatures are used in the same manner as the ink-on-paper signatures we are all familiar with. Certification signatures are considered part of creating the PDF file, so they occur only once, at the beginning.
Today we are only going to discuss certification digital signatures. The idea of a certification signature is to make sure that the document is authentic and has been unaltered since it was signed by the authenticating party. Don’t you want to know that the PDF statement you receive from your bank did, indeed, originate with your bank and hasn’t been tampered with since they created it? Of course, and that is what certification signatures are all about.
One other important thing to note is that any file attachments within the PDF document can also be covered by the same certification as the PDF file itself since those attachments are part of the PDF contents. This offers a very cool way to deliver information using file types other than PDF, by enveloping them in a PDF file that has been certified. And as explained in my blog about attachments the attachments may be compressed when attached to the PDF file making them smaller for transmission. This is especially cool for the delivery of certified XML data files. The enveloping PDF can also have pages describing the attachments and how to process them.
Example walkthrough
It is instructive to use a real example to show you about authenticated (certified) documents. The US Government Printing Office (GPO) delivers some certified PDFs from their website. This is a quote from the GPO home page:
"The U.S Government Printing Office (GPO) provides publishing &
dissemination services for the official & authentic government
publications to Congress, Federal agencies, Federal depository
libraries, & the American public."
The GPO certifies documents using PDF digital certification signatures and then puts those documents onto its website. Here is one you can look at and which I will use as an example.
If you open it in Adobe Reader either within your browser or in Adobe Reader directly you should see this:
Other PDF readers should also display some kind of verification information when a certified file is opened. If not, get a better reader.
Using Adobe Reader, the blue band at the top will display when a document has been certified with a digital signature as this one has. The blue band is normally not there. What the band displays is
“Certified by Superintendent of Documents <pki support@gpo.gov>, United States Government Printing Office, certificate issued by GeoTrust CA for Adobe.”
The blue band also displays an award-type ribbon on the left end to indicate that the document has been validated, just now when opened by Adobe Reader, and found to be authentic according to the digital signature that the GPO applied.
Special Note: The phrase "certificate issued by GeoTrust CA for Adobe" means that the GPO obtained a signing certificate from the certificate authority (CA) GeoTrust. The "for Adobe" has alarmed some people, but it just means that the certificate references the Adobe root certificate that comes with Adobe’s products. Adobe and GeoTrust, as well as many other CAs, have a written contract that allows those vendors to reference the Adobe root (this requires a secret key from Adobe) in exchange for guarantees on the level of integrity they enforce when issuing certificates. The bottom line for someone seeing this blue band with that reference is that the signing agency had to meet strict requirements to get that certificate and that the document authenticated successfully. This does not mean that Adobe has any lock on such certifications. It is OK for some other vendor to put the Adobe root certificate into their root store and also be able to validate these documents. Anyone can purchase such a certificate from many different vendors. This page gives a lot of information about certifying documents.
Another Note: In July 2010 I gave an introductory talk about the public key infrastructure (PKI) and digital signatures at the American Law Librarians Conference. You may want to consult that for more details about this signing technology.
Double Checking
As shown below, using Adobe Reader, a panel will open on the left when you click “signature panel” at the right end of the blue band at the top. Within that panel you can open the various topics to see the status of the integrity of the file and of the certifying digital signature. In this case, below, you can see that the check showed that the file hasn’t been changed and that the signature really does belong to the GPO, because it cryptographically chains up to the Adobe root.
The workflow
The workflow that I have referred to that GPO uses with these PDF files is one where they put the certified PDF file on their website and the user opens it in a web browser or downloads it onto their disk for viewing. So the validation of the certification happens on the user’s machine using Adobe Reader, Acrobat Standard, Acrobat Professional or any vendor’s reader as the document is opened. Some readers may not do this checking automatically or at all. If you want to have the document authenticity checked, make sure your reader does that, get a reader that does or use the Adobe Reader.
Here are the details of how those PDF documents are read and checked for authenticity instantaneously as you open them on your machine.
A customer of the GPO downloads such a certified PDF file. She opens it in Adobe Reader, either within a browser or not. Adobe Reader takes the following steps before showing the blue banner. It computes the cryptographic hash of the PDF file as downloaded onto her disk, leaving out the bytes that hold the digital signature itself. The encrypted digital signature inside the PDF file is decrypted using the signer’s public key (which is also provided inside the PDF file). The decryption can only succeed if the signature was encrypted with the private key corresponding to the signer’s public key (the one belonging to the GPO). The cryptographic hash that the GPO put into the digital signature is compared with the one just computed. If they match, the document hasn’t been changed since the GPO signed it. If they don’t match, some tampering has happened. The result is reported to the person opening the file in the blue band at the top.
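The hash-and-compare check described above can be sketched in a few lines of Python. This is only an illustration under loud assumptions: it uses textbook RSA with a toy key built from two well-known Mersenne primes, no padding, and a plain byte string standing in for the PDF file. A real reader parses a standards-defined signature block and never uses keys like these. The names `digest_as_int`, `sign` and `verify` are made up for this sketch.

```python
import hashlib

# ASSUMPTION: toy RSA key for illustration only. It is NOT secure.
# Both factors are Mersenne primes; a real key comes from a certificate.
P = 2**107 - 1
Q = 2**127 - 1
N = P * Q                            # public modulus
E = 65537                            # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))    # private exponent (Python 3.8+)


def digest_as_int(data: bytes) -> int:
    """SHA-256 of the data, reduced so that it fits under the toy modulus."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N


def sign(data: bytes) -> int:
    """Signer side: 'encrypt' the hash with the private key."""
    return pow(digest_as_int(data), D, N)


def verify(data: bytes, signature: int) -> bool:
    """Reader side: 'decrypt' the signature with the public key and
    compare the recovered hash with a freshly computed one."""
    recovered_hash = pow(signature, E, N)
    return recovered_hash == digest_as_int(data)


# A byte string standing in for the downloaded file.
document = b"%PDF-1.7 toy statement from the bank"
signature = sign(document)

# Flip one bit to simulate tampering after signing.
tampered = bytearray(document)
tampered[10] ^= 0x01
tampered = bytes(tampered)
```

Calling `verify(document, signature)` succeeds, while `verify(tampered, signature)` fails because the freshly computed hash no longer matches the one recovered from the signature.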
Further, the certificate for the signer is checked to see if it “chains” successfully up to the Adobe root (or some other root that has been agreed to or that is commonly available). This chaining operation also uses the public/private key pairs and encryption to verify that each certificate was indeed issued by the CA that it says issued it. If all this works, then we know that whoever is identified as the signer that certified this document is indeed who they say they are, according to the CAs. The success or failure of this is also reported to the person opening the PDF file. In some cases the software opening the file may attempt to go onto the Internet to check with the CAs to make sure that the certificates are still valid and have not been revoked.
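The chaining idea can be sketched as a short walk from the signer's certificate up to a trusted root. Everything here is a simplifying assumption: the `Cert` record, the helper names, the toy RSA keys built from Mersenne primes, and the names "Toy Root", "Toy CA" and "Toy Signer" are all hypothetical. Real certificates are X.509 structures with validity dates, extensions and revocation checks that this sketch leaves out.

```python
import hashlib
from dataclasses import dataclass

E = 65537  # public exponent shared by all toy keys


def make_key(p: int, q: int):
    """Return (public modulus, private exponent) for a toy RSA key."""
    return p * q, pow(E, -1, (p - 1) * (q - 1))


def digest_int(data: bytes, n: int) -> int:
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % n


@dataclass
class Cert:
    subject: str
    issuer: str
    public_n: int
    signature: int = 0

    def signed_bytes(self) -> bytes:
        # The fields the issuer vouches for (the signature is excluded).
        return f"{self.subject}|{self.issuer}|{self.public_n}".encode()


def issue(subject, subject_n, issuer_name, issuer_n, issuer_d) -> Cert:
    """The issuer signs the subject's name and public key."""
    cert = Cert(subject, issuer_name, subject_n)
    cert.signature = pow(digest_int(cert.signed_bytes(), issuer_n),
                         issuer_d, issuer_n)
    return cert


def chains_to_root(chain, trusted_roots) -> bool:
    """chain is ordered [signer, intermediate CA(s)..., root]."""
    if not chain:
        return False
    for cert, issuer in zip(chain, chain[1:]):
        if cert.issuer != issuer.subject:
            return False
        expected = digest_int(cert.signed_bytes(), issuer.public_n)
        if pow(cert.signature, E, issuer.public_n) != expected:
            return False
    # The root is trusted because it is in the local root store,
    # not because of its own signature.
    root = chain[-1]
    return trusted_roots.get(root.subject) == root.public_n


# ASSUMPTION: toy keys from Mersenne primes, for illustration only.
root_n, root_d = make_key(2**521 - 1, 2**607 - 1)
ca_n, ca_d = make_key(2**107 - 1, 2**127 - 1)
signer_n, signer_d = make_key(2**61 - 1, 2**89 - 1)

root = issue("Toy Root", root_n, "Toy Root", root_n, root_d)
ca = issue("Toy CA", ca_n, "Toy Root", root_n, root_d)
signer = issue("Toy Signer", signer_n, "Toy CA", ca_n, ca_d)
trusted = {"Toy Root": root_n}

# A certificate that claims "Toy CA" as issuer but was signed with another key.
forged = issue("Toy Signer", signer_n, "Toy CA", signer_n, signer_d)
```

With these values, `chains_to_root([signer, ca, root], trusted)` succeeds, while the forged certificate or an empty root store makes the check fail.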
In this example, the GPO obtained a certificate from the GeoTrust Certificate Authority (CA) which chains to the Adobe trusted root.
I took that file and modified one byte in it. Now when I open it I get this in the blue banner indicating that it has been tampered with (is invalid).
Making certified documents
Here is how those PDF documents are made and put onto the GPO website:
At the GPO, the files are signed with a certificate owned by the GPO and issued by a Certificate Authority (CA) that issues them against the Adobe root. If you have Adobe Acrobat Professional and a certificate, you too can perform this certification step. Just go to the menu item Advanced->Sign & Certify and pick either Certify with visual signature or Certify without visual signature. This example was certified without a visual signature. You need to have obtained a digital certificate that contains your public/private keys and identifying information in order to do this. These are obtained from CAs like GeoTrust, as shown here. Adobe, through its LiveCycle server product, also provides services that can certify documents as they are created on a server.
The Adobe software, whether Acrobat Professional or LiveCycle server, saves the file to disk, then computes the cryptographic hash over the bytes found on the disk, with the exception of the bytes where the digital signature will eventually be placed. It then encrypts that hash using the certifier’s private key (the GPO private key) and incorporates the result into a “digital signature” block as defined by various standards organizations. It then inserts this block into the hole that was left in the PDF file on disk to receive it. The digital signature contains the signer’s certificate, which includes her public key and the identity of the CA that issued the certificate. Any other certificates that will be needed to confirm the validity of the signer’s certificate are also included in the digital signature or in other places in the PDF file.
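The "hole" mechanism can be illustrated with a small Python sketch. It is a simplification under stated assumptions: the byte strings are stand-ins for a PDF file, the helper names are invented, and the bare SHA-256 digest is written into the hole where a real signer would place an encrypted hash wrapped in a signature block. It only shows why the hash skips the bytes reserved for the signature.

```python
import hashlib

HOLE_LEN = 64  # hex characters reserved for a SHA-256 digest


def build_unsigned(before: bytes, after: bytes):
    """Lay out the file with a zero-filled hole; return (data, start, end)."""
    data = before + b"<" + b"0" * HOLE_LEN + b">" + after
    start = len(before) + 1          # first byte of the hole
    return data, start, start + HOLE_LEN


def hash_outside_hole(data: bytes, start: int, end: int) -> bytes:
    """Hash every byte except those in data[start:end]."""
    h = hashlib.sha256()
    h.update(data[:start])
    h.update(data[end:])
    return h.hexdigest().encode("ascii")


def fill_hole(data: bytes, start: int, end: int, value: bytes) -> bytes:
    """Write the value into the hole without changing the file length."""
    if len(value) != end - start:
        raise ValueError("value must exactly fill the hole")
    return data[:start] + value + data[end:]


def check(data: bytes, start: int, end: int) -> bool:
    """Recompute the hash outside the hole and compare with its contents."""
    return data[start:end] == hash_outside_hole(data, start, end)


unsigned, start, end = build_unsigned(b"%PDF toy header ", b" toy trailer %%EOF")
signed = fill_hole(unsigned, start, end, hash_outside_hole(unsigned, start, end))

# Change one byte outside the hole to simulate tampering.
tampered = signed[:3] + b"X" + signed[4:]
```

Because the hole is excluded from the hash, filling it in afterwards does not invalidate the value stored there, while any change elsewhere in the file does.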
This signed/certified document is then placed onto the agency’s website (e.g., GPO).
Here is a version of the GPO story.
Jim King (jking@adobe.com)