Image Credit: Jason Leung on Unsplash

“Wait a minute. Wait a minute Doc, are you telling me you built a time machine out of a DeLorean?”

With those famous words from 17-year-old California teen Marty McFly, so begins the adventure that is “Back to the Future.” The iconic film, where Marty has to find a way back to 1985 after being transported to his parents’ 1955 high-school days in a time-traveling DeLorean built by scientist “Doc” Brown, reminds us we sometimes have to look to the past in order to move forward in the present.

At the 2022 Society for Scholarly Publishing (SSP) Conference, attendees had a chance to take a hypothetical trip to the early days of digital-first article production during the panel session “‘Back to the future’ of digital-first publishing: Where we are and where we’re going.” (Unfortunately, time-traveling DeLoreans weren’t in the cards this year!)

Moderated by Bill Kasdorf, Principal of Kasdorf & Associates, the session focused on the trajectory of single-source journal production up to now, wherein multiple article formats (e.g., PDF, HTML) are generated from one code-based file. The panel featured Randy Townsend, Director of Publishing Operations at PLOS; Charles O’Connor, Business Systems Analyst at Aries Systems; and Brian Cody, CEO and Co-Founder of Scholastica. Panelists ruminated on how current single-source article production practices compare to past predictions and the best ways forward based on those learnings.

So what were the key takeaways? We cover highlights below!

The promises of digital-first publishing are real, but uptake has been slow

As discussed by Charles O’Connor at the SSP session and in a 2015 Journal Article Tag Suite Conference Proceedings paper (O’Connor et al.), single-source production started catching on in the early 2000s as research stakeholders began exploring how the transition to a digital world could revolutionize scholarly publishing. One prediction was that single-source would eliminate manual article formatting and typesetting work, leading to more streamlined publishing processes with less room for error, since edits made to the source file would automatically carry through to every output. All of this could, in turn, help lower the cost of publishing and speed up research dissemination.
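
To make the "edit once, update every output" idea concrete, here is a minimal sketch in Python. It is not how any panelist's system works: the XML snippet is a simplified, JATS-like stand-in, and the function names (`parse`, `render_html`, `render_text`) are hypothetical helpers invented for illustration.

```python
import xml.etree.ElementTree as ET
from html import escape

# Simplified, JATS-like source file (illustrative only, not a complete JATS document).
SOURCE_XML = """<article>
  <front>
    <article-meta>
      <title-group><article-title>Single-source publishing</article-title></title-group>
    </article-meta>
  </front>
  <body>
    <p>One file drives every output.</p>
    <p>Edits happen once.</p>
  </body>
</article>"""


def parse(xml_text):
    """Pull the title and body paragraphs out of the source XML."""
    root = ET.fromstring(xml_text)
    title = root.findtext("front/article-meta/title-group/article-title")
    paragraphs = [p.text or "" for p in root.findall("body/p")]
    return title, paragraphs


def render_html(xml_text):
    """Produce a bare-bones HTML rendition from the source."""
    title, paragraphs = parse(xml_text)
    body = "".join(f"<p>{escape(p)}</p>" for p in paragraphs)
    return f"<h1>{escape(title)}</h1>{body}"


def render_text(xml_text):
    """Produce a plain-text rendition from the same source."""
    title, paragraphs = parse(xml_text)
    return "\n\n".join([title.upper()] + paragraphs)


# A correction is made once, in the source, and both renditions pick it up.
corrected = SOURCE_XML.replace("Edits happen once.", "Corrections happen once.")
html_output = render_html(corrected)
text_output = render_text(corrected)
```

A real pipeline would add a PDF renderer and far richer markup, but the shape is the same: every rendition is derived from the source, so nobody has to patch the outputs by hand.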

Fast-forward to the present, and there’s no question that many of the anticipated benefits of single-source production have come to fruition. During the “Back to the Future” session, Randy Townsend pointed to the COVID-19 pandemic as an example and amplifier of how single-source production can increase efficiencies and help publishers become more agile.

“We had to react in one of the worst circumstances; it really was one of those disaster scenarios,” said Townsend. “The abilities that XML unlocks for rapid content creation and to deliver a great experience for authors is invaluable.”

However, despite the many advantages of single-source production, Townsend, who helped lead the American Geophysical Union’s (AGU) transition to an XML-first workflow before moving to PLOS, noted that only a portion of publishers are taking advantage of such options. “If I were to look at the collective we, I’d say we’re at about fifty-fifty in terms of where we need to be,” he said. “Some are doing well, but we need to bring those who aren’t there yet up to speed.”

Cody added that change is brewing. “I agree it’s a mixed bag. But my read is when we talk about full body text now, there’s the idea that using XML as the canonical file is important and better. If anyone here has followed a journal’s guidelines for proofing articles, it’s easy to miss stuff, and it’s not the most fun. I think most are starting to see there’s a benefit to having a computer do that for us instead.”

Before we needed digital-first toolsets — now, we need to shift print-based mindsets

Considering reasons for the delay in single-source production uptake, O’Connor pointed to gaps in tooling in the early days of digital-first publishing. “The title of this session implies that at one point there was a promise of XML workflows that wasn’t fulfilled, but now maybe it can be,” he said, noting that was indeed the case for a while.

“In the early days, the toolset wasn’t there to edit XML effectively or make PDFs, unless you were willing to invest in a high-end system like 3B2 that required a skillset to be mastered by people with technical abilities, not publishing people,” said O’Connor. “There was a lot of disillusionment about what XML could do.”

Today, less specialized options are emerging, like Aries’ WYSIWYG XML editor LiXuid Manuscript that connects to other Aries workflow solutions and Scholastica’s software-based single-source production service, which can integrate with any existing journal workflow. With the advent of such options, panelists agreed the challenge to more widespread implementation of single-source production workflows is no longer a lack of tools so much as the need to shift publisher mindsets away from print-based tendencies.

“We’re still in a print-centric world talking about things like page counts and Article Processing Charges,” said Townsend. “I’m a big semantics person, and how we’re framing this conversation is a little bit problematic. It’s hard to move forward if you’re straddling the fence and still in a mindset that I won’t call outdated, but I’ll say is traditional and nostalgic. When you’re thinking about where your business is going, it can’t be where it was anymore; change is the only constant.”

Panelists agreed that publishing organizations of all sizes should focus on assessing the business case for adopting single-source production models. “If print has primacy, it’s expensive. We spend a lot of time on print layout, and if that’s not the most important thing in the long run, it may not be the best use of resources,” said Cody. “Fine-tuning PDFs instead of the XML may not be the right thing for our industry.”

Townsend also spoke to different cost-benefit examples. “Think of what your version of record is. If you don’t have an XML workflow, if you have to do a correction, you’ll find yourself spending time and money fixing multiple outputs. XML could save you that. When you think about our broader financial picture, at a time when so many are still trying to navigate the impacts of the pandemic, I think all of this matters.”

A barrier to achieving truly automated processes discussed during the session is the emphasis many publishers and authors still place on having highly stylized PDFs, which generally require some human intervention in InDesign. Panelists spoke about the merits of retaining print-like PDFs for publication branding as well as the drawbacks, including increased production time and costs in many cases.

Everyone agrees on XML, just not where it should go in the production process — and that’s OK

“XML-first” was a workflow term coined in the early days of digital-first production that’s now become nearly synonymous with the term single-source in some circles, but the meaning isn’t always so clear.

During the “Back to the Future” panel, Kasdorf posed the question, “when you’re talking about XML-first, what do you mean by first? There are various appropriate answers to that question that are not the same,” he added.

As noted in “The four roads to XML,” a 2017 SSP session by Inera and Typefi, the “original XML dream” was to have authors create XML documents. Editors would then work within the XML and use it to generate print-ready PDFs, HTML, and any other article file types needed.

However, the reality is most authors write in Word or LaTeX, and it can be hard to train them in XML-based systems. This has led software providers to focus on creating WYSIWYG XML editors similar to Word, like LiXuid Manuscript, or single-source workflows that place XML at later points in the publishing process so authors don’t need to work in it, like Scholastica’s production service.

During the “Back to the Future” session, the panelists spoke about various single-source production workflow approaches and the advantages and disadvantages of each, many mirroring those identified during Inera and Typefi’s 2017 session. Ultimately, the discussion suggested that the unique needs of publishers will dictate the “best” approach for them.

Kasdorf added, “often what people really want is structure first. It isn’t necessarily XML-first in all of those stages.”

Considering the needs of publishers, Townsend said making the production process as smooth as possible is key, regardless of how you get there. He said that starts with factoring production into earlier stages of publishing. “Before this session, I was reading old Scholarly Kitchen articles about where people projected we would be with XML. One of the things that stood out to me is that production is still left out of the conversation when you’re thinking about workflows,” he said. “And I think that’s to our disadvantage.”

A key opportunity to connect editorial and production teams that panelists spoke to is integrating peer review and production systems to move metadata downstream more effectively — leading to the next point on interoperability.

Publishers are still chasing interoperability with some gains in recent years

Seamlessly integrating single-source production processes with peer review and hosting systems as well as discovery services, something both Aries and Scholastica are working towards, has the potential to improve the flow of metadata throughout the entire publishing ecosystem.

Speaking to this, O’Connor noted, “the reader is a customer, and the author is a customer as far as I’m concerned. But you have funders as customers too, and that’s where the metadata matters. I think the real driver here is the metadata and the recognition that there are a host of other customers.”

A challenge publishers and service providers are grappling with now is how to automate the production of the many different types of metadata outputs needed for discovery services, which has proven to be a hurdle to achieving widespread interoperability. “JATS is just the jumping-off point. There are many ‘flavors’ of JATS XML to convert,” explained Cody, citing PubMed Central compliant JATS and Silverchair JATS as examples. “It presents a challenge and opportunity for code-based processes. Conversion is tricky but can have major payoffs. We don’t yet have a truly universal data structure, which is the real dream.”
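
As a rough illustration of what "many metadata outputs from one source" can look like in code, the Python sketch below reads a few fields from a JATS-like front matter and reshapes them for two hypothetical downstream consumers. The element names (`article-id`, `funding-source`) follow JATS conventions, but the DOI is a placeholder, and both output shapes are invented for this example; they are not the actual requirements of PubMed Central, Silverchair, or any other service.

```python
import json
import xml.etree.ElementTree as ET

# Simplified JATS-like front matter. The DOI and funder name are placeholders.
FRONT_XML = """<article>
  <front>
    <article-meta>
      <article-id pub-id-type="doi">10.1234/example.5678</article-id>
      <title-group><article-title>A sample article</article-title></title-group>
      <funding-group>
        <award-group><funding-source>Example Science Foundation</funding-source></award-group>
      </funding-group>
    </article-meta>
  </front>
</article>"""


def extract_metadata(xml_text):
    """Collect the DOI, title, and funder names into a neutral dictionary."""
    root = ET.fromstring(xml_text)
    doi = None
    for node in root.iter("article-id"):
        if node.get("pub-id-type") == "doi":
            doi = node.text
    return {
        "doi": doi,
        "title": root.findtext(".//article-title"),
        "funders": [f.text for f in root.iter("funding-source")],
    }


def to_dublin_core(meta):
    """Hypothetical target 1: flat Dublin Core-style key/value pairs."""
    return {
        "dc:title": meta["title"],
        "dc:identifier": f"doi:{meta['doi']}",
        "dc:contributor": list(meta["funders"]),
    }


def to_deposit_json(meta):
    """Hypothetical target 2: a nested JSON payload for some indexing service."""
    payload = {
        "article": {"doi": meta["doi"], "title": meta["title"]},
        "funding": [{"name": name} for name in meta["funders"]],
    }
    return json.dumps(payload, sort_keys=True)


meta = extract_metadata(FRONT_XML)
dublin_core = to_dublin_core(meta)
deposit = to_deposit_json(meta)
```

Each new consumer means another small mapping function rather than another round of manual re-keying, which is the payoff Cody describes. The tricky part in practice is that real targets disagree on far more than field names.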

To get closer to having interoperable systems, Townsend again noted the need to factor production into earlier publishing stages. “I think production people should be part of discussions upfront because they can prepare for considerations that are so important like syndication, indexing, and open data — those things we haven’t even agreed on yet. Having them be part of the conversation can help shape how these things evolve.”

Bringing it back to the future

The “Back to the Future” session showed that scholarly publishers have made great strides towards single-source production workflows since the early days of digital-first publishing. As single-source production processes and models become more established, the main challenge appears to be finding the right balance between upholding print conventions, where beneficial, and transitioning to digitally driven mindsets.

Closing out the session, Kasdorf said, “notice that the title of the session is digital-first. I would argue that’s a product issue and not necessarily a technical issue. Publishers are trying to move their operations from a print mentality to a digital mentality. In other words, the digital product comes first, and the print is a derivative. That’s a major shift, and we already see several of the biggest higher ed publishers in the world don’t do print anymore.”
