PDF standards
What is important to understand about the PDF standard and its different specifications is that a new version doesn’t eliminate the previous ones. Each new version expands the format’s capabilities, but that doesn’t mean that all of them must be used when creating PDF documents, or that documents created according to earlier specifications became obsolete with the introduction of PDF 2.0. In fact, there are still few PDF 2.0 documents out there, and even fewer of them use all the latest features of the format, while the majority of documents, even those created now, are PDF 1.7 or earlier. The reason is simple and follows from the philosophy and purpose of the PDF format: if an earlier specification is enough to represent the content of a document well, it is better to use it, for maximum compatibility with different PDF software.
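To see which specification version a given file claims, one can look at the header comment that opens every PDF file (for example `%PDF-1.7`). The following is a minimal sketch under that assumption; the function name `header_version` is hypothetical, and the sketch ignores the fact that a document catalog may declare a later version than the header.

```python
import re
from typing import Optional, Tuple

# Matches the header comment that opens a PDF file, e.g. b"%PDF-1.7".
_HEADER = re.compile(rb"%PDF-(\d)\.(\d)")


def header_version(data: bytes) -> Optional[Tuple[int, int]]:
    """Return (major, minor) from the %PDF-x.y header, or None if absent.

    Hypothetical helper for illustration: it reads only the header and
    does not inspect the document catalog.
    """
    # Look only near the start of the file, where the header is expected.
    match = _HEADER.search(data[:1024])
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))
```

A file that reports `(1, 4)` this way can be opened by practically any PDF software, which is the compatibility argument made above.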
PDF 2.0
Published in July 2017, PDF 2.0 is a significant step in the evolution of the format. It’s a major update and refinement of the PDF specification, building on everything accumulated over the previous years. Although support for a number of new features has been added, the main goal of PDF 2.0, and its main outcome, was to consolidate, clarify and clean up the specification. It was thoroughly revised to be easier to understand and cleared of legacy features. It now provides much clearer directions for developers creating PDF software, which will ultimately result in a better user experience with PDF documents. Its development by an independent group of industry experts under ISO (International Organization for Standardization) procedures also paved the way for innovations to be added more effectively in the future. PDF 2.0 defines the potential of the PDF format for the years ahead; that potential is yet to be fully developed, and we’ll surely see further updates in the future.
PDF/UA
“UA” stands for “Universal Accessibility”, and PDF/UA is a specification that defines how to make a PDF document readable by assistive technologies (special software or even devices), so that, for example, a computer can read the content of such a document aloud to anyone who depends on these technologies. As PDF documents have become common in our lives, especially in public services, banking, utilities, employment, medical and many other types of services, ensuring equal and easy access to them is crucial.
A PDF/UA document has a clearly and correctly defined, properly described logical structure. Using this structure description, assistive technology can tell what the heading of the document is, in which order to read its paragraphs and text columns, what the lists are, where the pictures are and what they depict, which repetitive headers and footers with page numbering to skip, and so on.
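As an illustration of how such a structure description can drive reading, here is a minimal sketch that walks a simplified in-memory tree in logical order. The `Node` class, the `read_aloud` function and the sample content are all hypothetical; the model is deliberately simplified and is not how a PDF file actually stores its structure.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Simplified, hypothetical stand-in for one element of a logical structure."""
    role: str                 # e.g. "H1", "P", "Figure", "Artifact"
    text: str = ""            # text content, if any
    alt: str = ""             # alternative description for figures
    children: List["Node"] = field(default_factory=list)


def read_aloud(node: Node) -> List[str]:
    """Collect what assistive software might announce, in logical order."""
    if node.role == "Artifact":
        # Repetitive headers, footers and page numbers are skipped.
        return []
    spoken: List[str] = []
    if node.role == "Figure":
        spoken.append(f"Image: {node.alt}")
    elif node.text:
        spoken.append(f"{node.role}: {node.text}")
    for child in node.children:
        spoken.extend(read_aloud(child))
    return spoken


doc = Node("Document", children=[
    Node("Artifact", text="Page 1"),
    Node("H1", text="Annual report"),
    Node("P", text="Revenue grew."),
    Node("Figure", alt="Bar chart of revenue by year"),
])
```

Calling `read_aloud(doc)` yields the heading, the paragraph and the picture description in that order, and leaves out the page number.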
FineReader can both convert existing PDF documents of any type into PDF/UA ones, and create PDF/UA documents from files in other formats, such as DOCX, XLSX, PPTX, RTF, image files, and others. This is possible thanks to ABBYY OCR technology, which can analyze any document structure regardless of its format.
PDF/A
PDF was created as a format that can provide interoperability across different software, computers and platforms. PDF/A extends the idea across time: it ensures that PDF documents can be opened in the future.
PDF/A is a PDF format for use in archiving, long-term preservation, and exchange of electronic documents. The visual appearance of electronic documents is maintained over time, irrespective of the tools and systems used for their production, storage and reproduction. Source documents can be paper ones, emails, “usual” PDF documents, webpages, and many others. PDF/A provides a reliable digital snapshot of any document, which remains searchable and fully actionable, for:
- Document management
- Automated workflows
- Legal records management
- Mail archiving: paper, fax or e-mail
- Document archiving, archive migration (e-Government, Legal, etc.)
Because of its ability to provide stable, uniform representation of documents across time and platforms, PDF/A is also used as the format for document management and ongoing document exchange. It is “digital paper,” as reliable, unchanged and permanent as paper documents, which we are used to trusting.
Variants of PDF/A
1, 2 or 3 refer to the PDF/A parts, which are, basically, stages of the format’s evolution. PDF/A-1 is the earliest and PDF/A-3 the most recent of the three. In general, the larger the number, the more capabilities are permitted in a PDF/A-compliant document.
b, u and a signify conformance levels, which define the specific purposes for which conformance is ensured. The requirements increase in exactly this order: from b, which ensures the least, to a, which requires the fullest conformance with the PDF/A specification:
- b (“Basic”) ensures preservation of visual appearance of a document when viewing or printing;
- u (“Unicode”), in addition to b, requires mapping all characters to Unicode. This ensures that the texts can be displayed correctly and that the documents remain searchable. ‘u’ was introduced as a separate level only starting from PDF/A-2;
- a (“Accessible”), in addition to b, requires mapping all characters to Unicode and having document structure information. This ensures preservation and correct interpretation of document content and logical structure (reading order), when interpreted by assistive technologies, for example.
| | PDF/A-1 | PDF/A-2 | PDF/A-3 |
|---|---|---|---|
| Based on | PDF 1.4 | PDF 1.7 | PDF 1.7 |
| Conformance levels | b, a | b, u, a | b, u, a |
| Essence | Sets usage restrictions and requirements for fonts, colors, annotations, and other elements so that a PDF document is self-contained and can reproduce its look for a long time. | Same as PDF/A-1 | Same as PDF/A-1 |
| Attachments | No attachments allowed | PDF/A attachments allowed | Allows attachments of any file format (CAD models, audio, video, XML data, Excel spreadsheets, Word documents, etc.)* Attachments must be linked to specific parts of the document, and the relationships between them and the document must be specified. |
| Password protection | No password protection allowed | No password protection allowed | No password protection allowed |
| Usage specifics | | PDF/A attachments make it possible to keep sets of related documents together while the whole set remains PDF/A compliant. | Attachments of various types allow archiving or unifying exchange even for complex documentation sets that contain information in formats that cannot be converted to PDF. Other applications or systems can work directly with attachments in their own formats.** |
* Using PDF/A-3 does not guarantee that the included attachments will remain usable in the future – it only allows their presence attached to the document.
** For example, an XML representation of an invoice’s content may be attached to the conventional visual representation of the invoice, which allows such an invoice to be processed in ERP systems either manually, based on its “image”, or automatically, using the XML data.
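The part and level rules described above can be sketched as a small lookup. This is only an illustration of the naming scheme (for example “PDF/A-2u”); the function name `parse_variant` and the `ALLOWED_LEVELS` mapping are hypothetical names introduced here, with the permitted levels taken from the table above.

```python
import re
from typing import Tuple

# Conformance levels permitted for each PDF/A part, as listed in the table above.
ALLOWED_LEVELS = {1: {"a", "b"}, 2: {"a", "b", "u"}, 3: {"a", "b", "u"}}

# A variant name such as "PDF/A-2u": part number followed by level letter.
_VARIANT = re.compile(r"PDF/A-(\d)([a-z])", re.IGNORECASE)


def parse_variant(name: str) -> Tuple[int, str]:
    """Split a variant name into (part, level), rejecting invalid combinations."""
    match = _VARIANT.fullmatch(name.strip())
    if match is None:
        raise ValueError(f"not a PDF/A variant name: {name!r}")
    part, level = int(match.group(1)), match.group(2).lower()
    if level not in ALLOWED_LEVELS.get(part, set()):
        raise ValueError(f"level {level!r} is not defined for PDF/A-{part}")
    return part, level


def allows_any_attachment(part: int) -> bool:
    """Only PDF/A-3 permits attachments in arbitrary file formats."""
    return part == 3
```

With this sketch, `parse_variant("PDF/A-2u")` returns the part and level separately, while `parse_variant("PDF/A-1u")` is rejected, because level u exists only from PDF/A-2 onward.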
How to check for PDF/A compliance
ABBYY FineReader PDF, and many other PDF software applications, indicate when a PDF document is in the PDF/A format, to prevent the user from actions that would make it invalid, such as adding password protection. When a detailed check of PDF/A compliance is needed, third-party validation tools are available.
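A PDF/A document declares its variant in its XMP metadata through the `pdfaid:part` and `pdfaid:conformance` properties, which is one way software can recognize such a file. The sketch below reads that declaration from an XMP packet that has already been extracted as text; the function names are hypothetical, extracting the packet from the PDF is assumed to happen elsewhere, and the result is only what the document claims, not a validation of compliance.

```python
import re
from typing import Optional, Tuple


def _find(xmp: str, name: str) -> Optional[str]:
    """Find a pdfaid property written either as an attribute or as an element."""
    pattern = (
        rf'pdfaid:{name}\s*=\s*["\']([^"\']+)["\']'            # attribute form
        rf'|<pdfaid:{name}>\s*([^<\s]+)\s*</pdfaid:{name}>'    # element form
    )
    match = re.search(pattern, xmp)
    if match is None:
        return None
    return match.group(1) or match.group(2)


def declared_pdfa(xmp: str) -> Optional[Tuple[int, str]]:
    """Return the declared (part, level), or None if no full declaration is found.

    Hypothetical helper: it reports the claim made in the metadata only.
    """
    part = _find(xmp, "part")
    level = _find(xmp, "conformance")
    if part is None or level is None or not part.isdigit():
        return None
    return int(part), level.lower()
```

A document whose metadata declares part 2 and conformance U would be reported as `(2, "u")`. Whether the file actually meets the requirements still has to be established with a dedicated validation tool.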
ABBYY FineReader supports creation or conversion of documents into any of the PDF/A variants. Which variant to select depends on the tasks, workflows and requirements of the user.