DPubS System and Development Agenda, 2004-2006
Digital Publishing System
(DPubS) is the name given to a set of software modules developed at
Cornell that together meet a range of electronic publishing needs: access,
navigation, and delivery of full-text content in a variety of file formats;
subscription access controls; e-commerce services (pay-per-view); automated
lookup and linking to other information resources (including DOI registration,
OAI compatibility, and reference linking); usage statistics for publishers
and institutional subscribers; and appropriate safeguards against automated
downloading of resources. The origins of DPubS are in the Dienst system,
developed by Cornell's Computer Science department in the early '90s and
used for several years as the engine behind NCSTRL, a distributed network
of Computer Science technical reports. This code base has been significantly
modified and extended, and this enhanced version of the Dienst system
now supports Project Euclid.
The DPubS architecture supports
and coordinates distinct services, which are implemented as separate software
modules. In all, there are currently ten services, each supporting a functional
area of activity. For example, the Index Service indexes metadata or full text
from a repository, or set of repositories, queries these indexes when
requested, and returns search results. The Subscription Service manages
subscription data and answers access rights questions when controlled
content is requested. Other services include Repository, User Interface,
and Registry Services. Each DPubS service has a well-defined interface.
The syntax for making every request of a particular service and the expected
format of each response are documented. The Dienst protocol formed the
basis of this documentation, and subsequent extensions (new requests and
responses) are documented in the DPubS code. This protocol is based on
HTTP verb requests. A similar idea was employed for the OAI protocol for
metadata harvesting, which borrowed directly from Dienst and resembles
it closely in design. The advantage of having such clearly articulated
interfaces is that services can be extended (the addition of new functionality,
with corresponding new requests and responses), without breaking existing
dependencies. This flexible and extensible modular design makes the DPubS
system well-suited for open source development, since existing functionality
need never be disrupted by another developer's desire to enhance that
functionality.
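The extensibility argument above can be illustrated with a minimal sketch. It is not DPubS or Dienst code: the `Service` class, the `dispatch` function, the `/<service>/<verb>` path layout, and the `Search` and `Count` verbs are all hypothetical names chosen for illustration. The sketch only shows how a service that exposes named verbs can gain a new verb without changing the behavior of an existing one.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

# Hypothetical handler type: takes request arguments, returns a response body.
Handler = Callable[[Dict[str, str]], str]


@dataclass
class Service:
    """A service exposing named verbs; all names here are illustrative."""
    name: str
    verbs: Dict[str, Handler] = field(default_factory=dict)

    def register(self, verb: str, handler: Handler) -> None:
        # Refuse to overwrite an existing verb so current callers keep working.
        if verb in self.verbs:
            raise ValueError(f"{self.name} already defines verb {verb!r}")
        self.verbs[verb] = handler

    def handle(self, verb: str, args: Dict[str, str]) -> str:
        if verb not in self.verbs:
            raise KeyError(f"{self.name} has no verb {verb!r}")
        return self.verbs[verb](args)


def dispatch(services: Dict[str, Service], path: str, args: Dict[str, str]) -> str:
    """Route an assumed '/<service>/<verb>' request path to its handler."""
    service_name, verb = path.strip("/").split("/", 1)
    return services[service_name].handle(verb, args)


index = Service("Index")
index.register("Search", lambda a: f"results for {a['query']}")
services = {"Index": index}

before = dispatch(services, "/Index/Search", {"query": "topology"})

# Extending the service: a new verb is added, the old one is untouched.
index.register("Count", lambda a: "0 matches")
after = dispatch(services, "/Index/Search", {"query": "topology"})
```

In this sketch the response to the existing `Search` request is identical before and after the `Count` verb is registered, which is the property the modular design is meant to guarantee.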
All DPubS services are currently
implemented in Perl, with appropriate programming standards employed.
DPubS operates in a mod_perl environment. The Project Euclid installation
of DPubS is currently running on a Sun server, under Solaris. A local
mirror installation of Euclid has been tested under Linux.
At the code level, DPubS is
richly commented and "readable" software. All files have a uniform
header with functional descriptions and modification histories. The relatively
few programmers working on the code have applied consistent documentation
styling to the actual code. The Perl documentation format POD has been
used in some modules and more extensive use of POD is being investigated.
Above the code level, a significant amount of internal documentation on
DPubS is currently maintained, in the form of "How to" manuals
for various processes: installing software, setting up configuration files,
initializing new titles, loading content, loading subscription data, etc.
This documentation will eventually be made public, most likely via web
publication.
Development Agenda
The scope of work on DPubS
during the current project period, 2004-2006, will focus on the following
features and improvements:
1. Creation of a general-purpose publishing platform
While much of the underlying
system, as currently implemented, is independent of any particular
content type or front-end look and feel (a strength of working
from the core Dienst system design, with its independent service
architecture), the DPubS front end, the User Interface Service, does not support
the broad range of content types and display options that would be
desirable. In order to move beyond math literature and the Euclid
community, and to make DPubS a general-purpose publishing platform
attractive to a wide range of publishers and publications, three areas
of work are needed:
Redesign of the DPubS User
Interface Service module to allow for the implementation of a scalable
and extensible XML/XSLT architecture. This major upgrade to the system
will provide a growing and diverse cohort of publishers with the flexibility
to cost-effectively modify the look and feel of publication-specific
pages and customize any sub-component publications within a single
instance of the system.
Redesign of underlying configuration
and metadata services to support a full range of publishing entities
and object types. The redesign and rationalization of the configuration
metadata used within DPubS will allow us to support a wider variety
of hierarchical models in a more flexible manner. For example, we anticipate
needing groups, or communities, that may include several
publishers, or one publisher that offers multiple publications. Additional
work is also needed to support a more extensible object metadata model,
such as METS, in order to allow for a variety of metadata standards
and a richer range of metadata types (technical and administrative,
as well as descriptive).
Enhancement of DPubS's capability
to handle non-serial literature. The Euclid version of DPubS was designed
to receive and deliver serial literature. The structure of non-serial
literature differs in significant ways, and DPubS needs to support the
ingest and delivery of myriad document formats. In general, what is
required is the enhanced ability to handle a wider range of document
models. It should be emphasized that extending DPubS's capability beyond
journals does not represent an alternative development of the system's core
functionality, but will significantly increase its flexibility of application.
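As a rough illustration of the hierarchical configuration model described above, the following sketch assumes a community > publisher > publication tree in which a setting defined at a higher level applies to everything beneath it unless overridden. The `Entity` class, the `resolve` method, the setting keys, and the example names are hypothetical and are not taken from the DPubS configuration format.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Entity:
    """A node in an assumed community > publisher > publication hierarchy."""
    name: str
    kind: str
    parent: Optional["Entity"] = None
    settings: Dict[str, str] = field(default_factory=dict)

    def resolve(self, key: str) -> Optional[str]:
        # Walk up toward the root until some level defines the setting.
        node = self
        while node is not None:
            if key in node.settings:
                return node.settings[key]
            node = node.parent
        return None


# Illustrative data only: one community, one publisher, two publications.
community = Entity("Example Community", "community",
                   settings={"stylesheet": "default.xsl", "metadata": "METS"})
publisher = Entity("Example Press", "publisher", parent=community)
journal = Entity("Example Journal", "publication", parent=publisher,
                 settings={"stylesheet": "journal.xsl"})
monograph = Entity("Example Monograph", "publication", parent=publisher)
```

Here the journal overrides the stylesheet while the monograph inherits the community default, and both inherit the metadata setting. One design point the sketch makes concrete is that serial and non-serial publications can share a single instance while still being customized individually.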
2. Provide on-line editorial
management services to support peer review activities
These services would provide
a suite of document management tools for use by journal as well as monograph
publishers. These tools would fit into the publishing workflow where
appropriate. An important design feature for such services would be
that the tools are operationally independent of each other or have well-defined
APIs for interacting. This will allow for staged and/or independent
development.
Editorial management services
might include:
- an on-line manuscript
submission tool, with automated alerts
- a reviewer database
- mechanisms (perhaps multiple)
for distributing papers to reviewers
- a tool to collect and
organize feedback from reviewers
- a tool to track accepted
papers through the editorial and composition process
- sorting/queuing capabilities,
to organize prospective journal issues
- access mechanisms for
forthcoming articles
- ability to "publish"
articles or entire issues, by easily moving final copy from the editorial
work area to the public distribution space in DPubS
3. Enhance the administrative
functionality and interface
This work would rationalize
production workflow, allowing greater segmentation of tasks and the
creation of simple tools to manage lower-level processes (adding new
publishers, adding new content, producing usage statistics, troubleshooting
user login problems, answering mail, etc.). The goal would be to reduce
the staffing cost for much of the daily/weekly administrative work by
reducing the skill level needed. We could move current staff off these
tasks and onto more demanding ones, and use temporary (student) staff
for the regular administrative work.
4. Ability to interoperate with Institutional Repositories (IR)
We anticipate broad interest
from adopters of institutional repository systems in providing electronic
publishing services via DPubS. The DPubS system could be engineered
as a layer on top of an institutional repository, using the IR for its
data storage and repository functions. Much of this work would involve
developing an API for the IR. While all institutional repositories would
be encouraged to develop interoperability with DPubS, this project will
target Fedora and DSpace, directly enabling that capability. Input and
technical support from Fedora and DSpace developers are necessary
to effectively implement this functionality.
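The layering idea can be sketched as follows, under stated assumptions. The `RepositoryBackend` interface, the `PublishingLayer` class, and their method names are hypothetical; the in-memory backend is a stand-in and does not call any real Fedora or DSpace API. The point is only that the publishing layer depends on a narrow storage interface, so that a Fedora or DSpace adapter could later be substituted.

```python
from abc import ABC, abstractmethod
from typing import Dict


class RepositoryBackend(ABC):
    """Hypothetical interface a publishing layer could require of an IR."""

    @abstractmethod
    def store(self, object_id: str, content: bytes) -> None:
        ...

    @abstractmethod
    def fetch(self, object_id: str) -> bytes:
        ...


class InMemoryBackend(RepositoryBackend):
    """Stand-in for a real IR adapter; it calls no Fedora or DSpace API."""

    def __init__(self) -> None:
        self._objects: Dict[str, bytes] = {}

    def store(self, object_id: str, content: bytes) -> None:
        self._objects[object_id] = content

    def fetch(self, object_id: str) -> bytes:
        return self._objects[object_id]


class PublishingLayer:
    """Publishing logic that depends only on the backend interface."""

    def __init__(self, backend: RepositoryBackend) -> None:
        self.backend = backend

    def publish(self, object_id: str, content: bytes) -> None:
        self.backend.store(object_id, content)

    def deliver(self, object_id: str) -> bytes:
        return self.backend.fetch(object_id)


layer = PublishingLayer(InMemoryBackend())
layer.publish("article-1", b"full text")
```

Calling `layer.deliver("article-1")` returns the stored bytes. An adapter for a specific repository system would implement the same two methods against that system's own interface.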

Latest News
- June 2005, Press Release: The journal Indonesia chooses DPubS as its electronic distribution solution
- June 2005, Press Release: Quarterly journal Pennsylvania History utilizes DPubS to deliver its archive on-line
- December 2004, Press Release: ARL Newsletter, The Development of an Open Source Publishing System
- August 24, 2004, Press Release: Cornell Library to Distribute its Open Source Electronic Publishing System
- July 2004, Review: Project Euclid: Mathematics and Statistics Journals, by Gerry McKiernan