Tuesday, April 7, 2009

Distributed Proofreaders: What your work could look like if it was paginated...

I'm indebted to Juliet Sutherland for commenting on this blog, and especially for pointing me towards the work being done on books for Project Gutenberg by Distributed Proofreaders.

As Juliet points out, their aims are not exactly the same as mine. They are trying to "make the best of the Web as it is today", while I and some others argue that the way to make a lot of things better on the Web is to expand what you can do on it.

Readers of this blog will know I'm a proponent of being able to do adaptive pagination on the Web - paginated content whose layout adapts to fit the screen on which it's being viewed.

People like Joe Clark (actually, I don't think there are any other people like Joe) would have you think that I'm a heretic, who wants to "turn the Web into outmoded print layouts".

That's not what I'm advocating at all; what we have on the Web today should stay - but there are better alternatives for many types of "reading" content, and those should be added to what we already have, thus expanding the range of possibilities.

Then people can make up their minds what they prefer, instead of having to work within silly, outmoded constraints like "Web pages always have to scroll in a bottomless window".

As I've said elsewhere, that particular constraint exists only because the software engineers building the first publicly-available Web browser, NCSA Mosaic, took the easy shortcut of displaying Web content in a bottomless scrolling window in order to avoid the harder layout problem of pagination. It's become part of the fabric of the Web, and it's high time it was questioned.

Distributed Proofreaders has done a great job on books - working within the constraints of the Web today. Juliet pointed me to "The Dance of Death", a 16th-century book by Holbein, with fantastic woodcuts, as an example of how their work can adapt to different screens.

Here are some thoughts on opening up the book and examining it and its markup, and some ideas on where I'd like to see this type of content be able to go in the future.

None of this should be construed as a criticism of what DP has done (although I do have minor niggles like the use of "inches" and 'feet' marks instead of proper typographer's quotes). This DP book has been proofread and set with great care and thought, and if it's a typical example, then DP is to be congratulated on a job well done.

Since DP doesn't specify typefaces or fonts in the books it does, at first all the text appeared in Times New Roman - the default font in most browsers for pages that don't specify fonts.

Now, TNR was a great print face in its day. It's not very nice on screen, because it has a small x-height. And it really does look old and tired. Part of that is no doubt because it has been the default font for documents in word-processing software for decades. It has, not to put too fine a point on it, been beaten to death. But that aside, it does look "old-fashioned" - and not in a nice way... The first thing I did was change my default font in Internet Explorer 8 to Cambria. Instant improvement! Cambria is the best serif face for reading on screen, no question, and this book looks great in it. (I'd choose Calibri as my favorite sans serif.) Another way to do this would be to switch CSS stylesheets (which Internet Explorer now supports, since Beta 2 of IE8).

Incidentally, Juliet had the gripe that many of the problems DP faces were the result of trying to get pages to work with Internet Explorer. I hope that will become a "gripe of the past" now that IE8 has shipped using Web-standards rendering by default.

Next thing was to play around with browser width, by re-sizing my browser window. As Juliet points out, the layout adapts very well to changes in browser width, which would equate to different screen sizes.

However, on a larger, modern laptop display, there's so much unused screen either side of the browser window that the "book" gets lost. There's too much distraction either side - even if it's only a large area of unused white screen.

And while the content adapts to width very well, it's still a less-than-optimal scrolling read...

Since Juliet sent me this link yesterday, I've been experimenting with different layouts to create an improved, paginated version of this book, just to show what it could look like. But this morning an email arrived from my colleague Mike Duggan in Ireland, with a really nice paginated layout.

Mike's a visual guy, a great typographer, and tends to do his mockups in Windows Paint. So this is just a .jpg. But it's easy to see how it could be created on the Web, using a CSS stylesheet plus multicolumn and hyphenation Javascripts. Doesn't it look much better than most content of this type we see on the Web? With a layout like this, in Full Screen mode, you could truly have an immersive book-reading experience.

What do we need to be able to do this on the Web? Well, the biggest obstacle is that today you'd have to paginate it manually, which is not only time-consuming, but means you have to decide upfront on a fixed size - and that's terrible.

With AJAX-style scripting, you could get the window size from the browser, and use that to calculate not only the optimum number of columns, but also the depth of the columns into which the content would flow.
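Here's a rough sketch of that calculation, written in Python purely to show the arithmetic. Everything in it is my own assumption for illustration: the function name, the default margins and gutters, the "ideal" column width and the line-height are not taken from any browser or standard.

```python
from dataclasses import dataclass


@dataclass
class ColumnLayout:
    count: int               # number of columns on a page
    column_width_px: float   # width of each column
    lines_per_column: int    # column depth, measured in lines of body text


def column_layout(window_width_px, window_height_px,
                  ideal_column_width_px=400.0, gutter_px=32.0,
                  margin_px=48.0, line_height_px=24.0):
    """Derive column count and column depth from the window size.

    All default values are illustrative assumptions, not recommendations.
    """
    usable_width = window_width_px - 2 * margin_px
    usable_height = window_height_px - 2 * margin_px

    # As many columns of the ideal width (plus gutters) as will fit, minimum one.
    count = max(1, int((usable_width + gutter_px)
                       // (ideal_column_width_px + gutter_px)))

    # Share the usable width equally among the columns.
    column_width = (usable_width - (count - 1) * gutter_px) / count

    # Column depth is a whole number of text lines, so no line is cut in half.
    lines = max(1, int(usable_height // line_height_px))
    return ColumnLayout(count, column_width, lines)
```

With these assumed defaults, a 1440 x 900 window gives three columns of 33 lines each, while a 320 x 480 phone-sized window falls back to a single column of 16 lines.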

You really don't want to be dependent on Javascripts for multi-column and hyphenation. I've talked about the problems of a DOM-based multicolumn .js elsewhere in this blog. Those functions are much better done with the layout and composition engines of the browsers - which are much more sophisticated.

You'd like to be able to just create a Master page, then hand off the actual setting to the browser - whatever browser the reader is using - and have it create as many "new pages" as it needs to place all the content.

You'd want graphics to be scaled to fit a grid determined by the AJAX calculation. That grid would be based on multiples of body-text line-height - and would change if the reader, for example, wanted or needed to read in larger type.
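To make the grid idea concrete, here is a minimal sketch in Python. It assumes one simple policy of my own choosing: scale the graphic to the column width, then shrink it slightly so its height is a whole number of text lines. The function name and that policy are hypothetical, and a real layout engine might round differently.

```python
def fit_image_to_grid(image_width_px, image_height_px,
                      column_width_px, line_height_px):
    """Scale an image so its height is a whole multiple of the line-height.

    Returns (width, height, lines_occupied). Assumed policy: never exceed
    the column width, and round the height down to the line grid.
    """
    # First scale the image to fill the column width.
    scale = column_width_px / image_width_px
    scaled_height = image_height_px * scale

    # How many whole text lines fit in that height?
    lines = int(scaled_height // line_height_px)
    if lines == 0:
        # Very shallow image: leave it at column width, occupying one line slot.
        return column_width_px, scaled_height, 1

    # Snap the height to the grid and keep the aspect ratio.
    height = lines * line_height_px
    width = image_width_px * height / image_height_px
    return width, height, lines
```

An 800 x 600 woodcut in a 400-pixel column with a 24-pixel line-height comes out at 384 x 288, exactly 12 lines deep. If the reader bumps the line-height up to 30 pixels, the same image is reset at 400 x 300, or 10 lines deep, so it stays on the grid.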

There would be common elements for every page. You'd need to increment page numbers. The "virtual pages" created would need to be temporarily stored in a cache somewhere so the reader could navigate the "book".
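As a sketch of what that might involve, the Python below splits already-composed lines of text into numbered "virtual pages" and keeps them in a cache keyed by layout. It is only an illustration under my own assumptions: `paginate`, `PageCache` and the `running_head` field are hypothetical names, and real pagination would also have to deal with images, headings, widows and orphans.

```python
def paginate(lines, lines_per_column, columns, running_head=""):
    """Split text lines into numbered pages of `columns` columns each."""
    per_page = lines_per_column * columns
    pages = []
    for start in range(0, len(lines), per_page):
        chunk = lines[start:start + per_page]
        # Fill each column top to bottom before starting the next one.
        cols = [chunk[i:i + lines_per_column]
                for i in range(0, len(chunk), lines_per_column)]
        pages.append({
            "number": len(pages) + 1,       # incremented page number
            "running_head": running_head,   # element common to every page
            "columns": cols,
        })
    return pages


class PageCache:
    """Keeps the virtual pages for each layout so they are built only once."""

    def __init__(self, lines, running_head=""):
        self._lines = lines
        self._running_head = running_head
        self._pages = {}

    def pages_for(self, lines_per_column, columns):
        key = (lines_per_column, columns)
        if key not in self._pages:
            self._pages[key] = paginate(self._lines, lines_per_column,
                                        columns, self._running_head)
        return self._pages[key]
```

Keying the cache on the layout means that if the reader resizes the window and then goes back to the original size, the earlier set of pages can be reused rather than rebuilt.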

There are all kinds of details like this which would need to be thought through. And there would certainly be HTML and CSS standards developments which could help.

There are issues like the one pointed out by Richard Fink, of how you index and reference in a book which has different page numbers for different readers. (My suggestion for this is to take a leaf out of the Bible, and refer to passages by Chapter and Verse. Amazon's Kindle uses "Location numbers", which work but are pretty ugly.)
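A small Python sketch shows why a Chapter-and-Verse style reference survives re-pagination. It assumes, purely for illustration, that each passage has a stable (chapter, paragraph) label and that we know how many lines each passage occupies in the current layout. The function name and the sample numbers are made up.

```python
def page_of_reference(passage_line_counts, reference, lines_per_page):
    """Find the page on which a (chapter, paragraph) reference starts.

    `passage_line_counts` maps (chapter, paragraph) -> lines occupied in the
    current layout, in reading order. The reference itself never changes;
    only the page number computed from it does.
    """
    lines_before = 0
    for key, line_count in passage_line_counts.items():
        if key == reference:
            return lines_before // lines_per_page + 1
        lines_before += line_count
    raise KeyError(reference)


# Hypothetical book: two paragraphs in chapter 1, one in chapter 2.
counts = {(1, 1): 10, (1, 2): 25, (2, 1): 40}
small_screen_page = page_of_reference(counts, (2, 1), lines_per_page=30)
large_screen_page = page_of_reference(counts, (2, 1), lines_per_page=60)
```

The same citation, "Chapter 2, paragraph 1", lands on a different page for each reader, but it still points at the same passage. In practice the line counts would also change with column width, which this sketch ignores.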

That's why this kind of innovation shouldn't be done in any one browser. It needs to be a collaboration involving them all. It has to become an extension to existing Web standards.

For example, the CSS3 standard for multiple columns allows you to specify columns either as an integer number - in which case column width floats - or as a column width, in which case the number of columns floats. It would be nice to be able to specify upper and lower limits for column width, so you'd get smooth reflow as window size changed.
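The difference between the two modes, and what upper and lower limits might add, can be sketched in a few lines of Python. The first two functions follow the arithmetic I understand the CSS3 multi-column draft to use. The third, `columns_within_limits`, is entirely hypothetical: it is one possible reading of "upper and lower limits" and is not part of any standard.

```python
import math


def width_for_count(available_px, count, gap_px):
    """Column count is fixed, so the column width floats."""
    return (available_px - (count - 1) * gap_px) / count


def count_for_width(available_px, column_width_px, gap_px):
    """Column width is requested, so the number of columns floats."""
    return max(1, int((available_px + gap_px) // (column_width_px + gap_px)))


def columns_within_limits(available_px, min_px, max_px, gap_px):
    """Hypothetical: pick a column count that keeps width within limits."""
    # Fewest columns that keep each column no wider than the upper limit.
    count = max(1, math.ceil((available_px + gap_px) / (max_px + gap_px)))
    # Drop columns if that made them narrower than the lower limit.
    while count > 1 and width_for_count(available_px, count, gap_px) < min_px:
        count -= 1
    # If the limits cannot both be met, cap the width and leave wider margins.
    width = min(width_for_count(available_px, count, gap_px), max_px)
    return count, width
```

With limits of 300 and 500 pixels and a 32-pixel gap, a 1280-pixel measure gives three columns of about 405 pixels. At 600 pixels, two columns would be too narrow, so the sketch falls back to a single column capped at 500 pixels.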

I know my own personal ideal (and that of Mike Duggan and another expert colleague, Geraldine Wade) is to use my browser in Full Screen mode. But that may not be what everyone wants, and they should be free to choose what they prefer. If our way works better for them, that's what they'll end up using.

There's a lot of work needed to make this happen. But as far back as 1984, I saw an unknown software application - running at that time only on the Apple Macintosh - which would let you create a Master Page grid for any size of page, then allow you to autoflow content into it. The application would automatically create as many new pages as it needed to place all your content.

That little application was called PageMaker, and it created an entirely new market for software, fonts, printers etc. called DeskTop Publishing (it was the era of the capital letter in the middle of words).

Is anyone really trying to tell me that the Web can't be made capable of doing what was easily possible a quarter of a century ago? We need it to become a publishing platform capable of the highest-quality layout and typography people can imagine.

I have an OnScreen Reading discussion group over on FaceBook where anyone is welcome to get together and talk about all of this. Ultimately, though, I'd like to see this discussion take place under the auspices of the W3C or the CSS working group. Maybe it's already happening, and I just don't know about it...

Word of warning: I'm not interested in flame wars over on FaceBook. People do need to be able to argue their point of view, if it's done in the spirit of moving forward - but not as adversaries, scoring points. I'm keeping myself as the only admin on that group, so I can throw off anyone I judge is getting out of hand.
