Abstract
The Portable Document Format (PDF) was developed by Adobe in the early 1990s and is today the de facto standard for electronic document exchange. It allows published materials to be reproduced reliably on any platform and is used by many governmental and educational institutions, as well as by companies and individuals. PDF documents are also credited with being more secure than other document formats, such as the Microsoft Compound Document File Format or Rich Text Format. This paper investigates the Portable Document Format and shows that it is not immune to some of the privacy-related issues that affect other popular document formats. From a PDF document, it is possible to retrieve any text or object previously deleted or modified, extract user information, and perform actions that may be used to violate user privacy. This issue has several applications. One of them is relevant to the scientific community: it can be used to defeat the blind review process of a paper by revealing information about the anonymous referee (e.g., the referee's IP address).
Index Terms
- Security and privacy issues in the Portable Document Format