Evaluating Adobe’s new cloud-based auto-tagging feature for PDF accessibility
James Baverstock | 23 Jul 2024
James Baverstock is a Principal Accessibility Consultant at AbilityNet, specialising in consultancy, training, and auditing for document accessibility. As a certified Accessible Document Specialist (ADS) from the International Association of Accessibility Professionals (IAAP), he is well-versed in the latest advancements in the field.
Adobe has introduced a new cloud-based auto-tagging feature for PDFs that promises to make it easier to produce accessible PDF documents. In this article, James explores how to access this feature and how it compares with Acrobat Pro’s previous auto-tagging functionality.
Why tags are important for PDF accessibility
Good tagging is critical for PDF accessibility. Tags work like invisible labels that provide semantic meaning for elements like headings, lists and tables for screen readers and braille displays, as well as defining the reading order for assistive technology.
You can create well-tagged PDFs if you follow best practice in authoring programs such as Word or InDesign, but many PDFs are created with no tags or bad tagging that is positively unhelpful.
Remediating an untagged or badly tagged PDF entirely by hand in Acrobat Pro can be very time-consuming. A quicker option is to run Acrobat’s auto-tagging first to add tags, then manually review and adjust the tags it produces.
Unfortunately, Acrobat’s old auto-tagging feature had a lot of limitations, but Adobe has now introduced a new cloud-based version that leverages artificial intelligence and is advertised to be much improved.
Accessing the new auto-tagging feature
This new auto-tagging option is cloud-based, and you must explicitly enable it before it can be used.
To turn it on in Acrobat Pro, use: Edit > Preferences > Accessibility > Enable cloud-based auto-tagging for accessibility.
When you open an untagged PDF in Acrobat Pro, you can run auto-tagging via the “Autotag Document” option in the “Accessibility” menu.
Adobe has stated that there are some limitations on content that can use the new feature, which are listed below:
- Large files (larger than 100MB, or with more than 200 pages, or more than 100 pages for a scanned PDF)
- Files hosted on cloud storage
- Secured and protected files that don’t allow copying
- Content in languages that don’t use a Latin-based alphabet
- Form fields
- Text within annotations
If your file cannot be tagged using cloud-based auto-tagging, then Acrobat Pro automatically falls back to using the old, local auto-tagging method instead.
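The file-level limits and the fallback described above can be sketched as a simple pre-check. This is only an illustration of the published rules, not Adobe's implementation: `PdfInfo`, `cloud_tagging_blockers` and `tagging_method` are hypothetical names, the values are assumed to be supplied by you rather than read from a real PDF, and "100MB" is assumed to mean 100 × 1024 × 1024 bytes. Form fields and text within annotations are left out because they concern content inside a file rather than the file as a whole.

```python
from dataclasses import dataclass

# Limits taken from Adobe's stated restrictions.
MAX_SIZE_BYTES = 100 * 1024 * 1024  # assumption: "100MB" read as 100 MiB
MAX_PAGES = 200
MAX_PAGES_SCANNED = 100


@dataclass
class PdfInfo:
    """Hypothetical summary of a PDF; not an Adobe data structure."""
    size_bytes: int
    page_count: int
    is_scanned: bool = False
    on_cloud_storage: bool = False
    copying_allowed: bool = True
    latin_alphabet: bool = True


def cloud_tagging_blockers(pdf: PdfInfo) -> list[str]:
    """Return the reasons a file falls outside the cloud-based limits."""
    blockers = []
    if pdf.size_bytes > MAX_SIZE_BYTES:
        blockers.append("larger than 100MB")
    page_limit = MAX_PAGES_SCANNED if pdf.is_scanned else MAX_PAGES
    if pdf.page_count > page_limit:
        blockers.append(f"more than {page_limit} pages")
    if pdf.on_cloud_storage:
        blockers.append("hosted on cloud storage")
    if not pdf.copying_allowed:
        blockers.append("secured file that does not allow copying")
    if not pdf.latin_alphabet:
        blockers.append("language does not use a Latin-based alphabet")
    return blockers


def tagging_method(pdf: PdfInfo) -> str:
    """Mirror the described fallback: any blocker means local auto-tagging."""
    return "local" if cloud_tagging_blockers(pdf) else "cloud"


if __name__ == "__main__":
    report = PdfInfo(size_bytes=5_000_000, page_count=150, is_scanned=True)
    print(tagging_method(report), cloud_tagging_blockers(report))
```

A check like this could be useful when triaging a batch of documents, since it shows in advance which files would be tagged with the old local method and are therefore likely to need more manual remediation.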
How well does it work?
To test how well the new auto-tagging works, I ran both the old and new auto-tagging on some different types of content to see how they compared.
The new auto-tagging does seem to be a considerable improvement over the previous version. Among the positive factors are:
- Auto-tagging of headings is much improved. The old, local auto-tagging seemed to rely only on text size to decide a heading's level, and heading levels (even for the main heading on the page) were often wrong.
- Auto-tagging is now better at recognising when a list or table breaks across pages, and tags it as a single list or table rather than as separate elements on each page. Nested lists are also better tagged.
- Tables without visible borders between cells can now be recognised as tables when they were not previously, and column-based layouts are handled better.
- Local auto-tagging was unable to recognise headers and footers. These would be tagged as paragraphs instead of converted to artefacts as recommended. Cloud-based auto-tagging is much better at recognising header and footer text.
There are still limitations though. In particular:
- Alt text is not added for images, so this needs to be added manually using the “Set Alternate Text” tool. (This seems sensible though, given the current state of AI-generated alt text, and avoids the risk of adding inappropriate automatically generated text.)
- The tagging of tables (particularly complex tables) is sometimes not ideal and needs further remediation.
- More complex layouts can cause misinterpretation of reading order and heading levels, although simpler layouts seem to be interpreted much better than they used to be.
In conclusion, the new auto-tagging is a good improvement for Acrobat that can definitely save time in your PDF remediation, but it is not perfect - so it remains very important to manually review the tags it produces and adjust them if necessary.
Further resources
- Adobe: Enhance document accessibility with cloud-based auto-tagging
- We offer help with PDF accessibility, such as our PDF accessibility training, guidance documentation and training workshops. Get in touch to discuss your needs.
- The JISC June 2024 accessibility clinic featured presentations from Adobe on auto-tagging, including details on automating auto-tagging at scale via its API.