If you have thousands of laid-out pages but now need their content to be structured, all is not necessarily lost.
For over a decade, we had laid out completed forms using Adobe InDesign. Although key elements were placed in separate boxes, they were set up to look good in print, and their content lacked any structure. The task was to turn more than 7500 individual files into a database that we could search, sort, and analyse – something conventionally viewed as requiring prolonged labour and great cost.
I had just three days, and the tools to hand on my iMac.
Searching for structure
When confronted by apparently unstructured content, the first quest is to find a format in which it can be rendered with some structure, even if that is not as fine-grained as you might wish. The most obvious routes are often the least useful, though: exporting…