web2express.org

August 26, 2006

Develop demo release of web2x publishing tool

Filed under: semantic publishing — admin @ 7:43 pm

[::Associated Project::]

web2x publishing tool

[::Subject::]

web publishing, semantic web, RDF, self-publishing experiment, semantic publishing, open data
[::Introduction::]

User Requirements:

1. Easy to install on the user’s server, and easy to use for writing and publishing to the web.

2. Allows users to publish research data in smaller units such as experiments, protocols, tools used, and projects, along with related information such as researchers, research groups, organizations, and traditional publications.

3. The data should be presented both in HTML for humans to read and in RDF for computers to process.

4. Has at least basic content management features such as categorization, chronological archiving, and site search.

5. Allows readers to comment on the published content.

6. Able to manage user accounts and privileges.

7. Search engine friendly.

8. Must be a web application that can be used across many platforms.

Development requirements:

1. Use existing open source software as the foundation. Ideally, develop only plugins that extend the existing code base. Avoid touching the core code so that it stays easy to upgrade to new versions of the core.

2. Must use a platform-independent language such as Java or PHP.

3. Should be relatively easy to prototype by myself in a short period of time (a couple of months).

4. Should be released to the open source community for further development.

[::Hypothesis::]

[::Procedure::]

1. Survey existing open source software: content management systems, wiki tools, and blogging tools.

2. Pick one from each category to test: check out the source code, set up a development environment, code new features, and evaluate the developer support community.

3. Decide on the best code base and develop new features for it.

4. Create minimal documentation.

[::Protocols Used::]

[::Tools Used::]

WordPress, http://wordpress.org/

RAP API, http://www.wiwiss.fu-berlin.de/suhl/bizer/rdfapi/index.html

[::Data::]

[::Data Links::]

[::Result::]

1. Comparison of existing open source software that can be used for publishing:

Drupal: Great content management system, but it has fewer publishing features and may be too complex.

XWiki: Java-based. Solid design, but not mature enough (still buggy).

TikiWiki: PHP-based. Navigation is not easy.

Roller: Java-based blogging tool. Too complex to install and use.

WordPress: PHP-based. Easy to use, lightweight core, plugin structure, theme selection, strong support from the developer community.

In general, CMS and wiki tools are not optimal for scientific publishing, while a blogging tool is closer to a publishing platform. Among the many blogging tools available, WordPress really stands out because it is very easy to install and use, and can be easily extended to support semantic features. Plus, it is optimized for search engines.

So, the choice is clear: WordPress.

2. Add a topic-specific post-writing feature to WordPress

WordPress has convenient hooks (its plugin API) for adding submenus under the “post write” menu, intercepting the post content before and after it is written, and setting custom fields on a new post. Using these hooks, submenus are implemented for writing the following specific types of posts:

· Experiment

· Project

· Protocol

· Product

· Publication

· Member
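The plugin itself is written in PHP against the WordPress hooks, and its code is not shown here. As a rough, language-neutral illustration of the hook pattern only, the Python sketch below registers a callback that tags a post with its type before it is saved. The names `register_hook`, `run_hooks`, `before_save_post`, and `web2x_type` are all hypothetical and are not WordPress functions or fields.

```python
from collections import defaultdict

# Registry mapping a hook name to the callbacks attached to it.
_hooks = defaultdict(list)


def register_hook(hook_name, callback):
    """Attach a callback to a named hook (hypothetical helper)."""
    _hooks[hook_name].append(callback)


def run_hooks(hook_name, value):
    """Pass value through every callback attached to hook_name, in order."""
    for callback in _hooks[hook_name]:
        value = callback(value)
    return value


# The six post types listed above.
POST_TYPES = ["Experiment", "Project", "Protocol", "Product", "Publication", "Member"]


def tag_post_type(post):
    """Hypothetical plugin callback: store the post type as a custom field."""
    if post.get("type") in POST_TYPES:
        post.setdefault("custom_fields", {})["web2x_type"] = post["type"]
    return post


# The core would call run_hooks("before_save_post", post) just before saving.
register_hook("before_save_post", tag_post_type)
```

The point of the pattern is that the core only needs to call `run_hooks` at fixed points; everything specific to the publishing tool lives in the registered callbacks, so the core files stay untouched.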

WordPress also has a special type of post called a “page” that is used for static and timeless content. The following types of information are implemented as pages because each blog site may be associated with only one group and one organization, which normally do not change.

· Organization

· Group

3. Generate a semantic data model for topic-specific posts

Each specific content type is defined by a class in the Self-publishing of Experiment (SPE) ontology, which is being developed by another project. In the first demo version, templates are provided for the different content types so that users are guided to provide data for the corresponding fields (i.e. the properties of the ontology classes). When the post or page is saved or published, the content in the editor is parsed to create a semantic data model corresponding to the ontology class. The data model is then serialized as an RDF file.

A PHP-based RDF API called RAP is used for semantic data modeling and RDF file writing.
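The parse-and-serialize step can be sketched roughly as follows. This is not the RAP-based PHP code; it is a minimal Python stand-in under several assumptions: that the template marks fields with `[::Field::]` lines (as in the layout of this post), that each field maps to a property named after it, and that the output is N-Triples, chosen here only for brevity. The `http://example.org/spe#` namespace is hypothetical, since the real SPE ontology URI is not given.

```python
import re

# Hypothetical namespace; the real SPE ontology URI is not given in this post.
SPE = "http://example.org/spe#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Assumed template syntax: a line such as "[::Tools Used::]" starts a field.
FIELD_MARKER = re.compile(r"^\[::(.+?)::\]\s*$")


def parse_template(text):
    """Split editor content into {field name: value} using the field markers."""
    fields = {}
    current = None
    for line in text.splitlines():
        stripped = line.strip()
        match = FIELD_MARKER.match(stripped)
        if match:
            current = match.group(1)
            fields[current] = []
        elif current is not None and stripped:
            fields[current].append(stripped)
    # Join multi-line values and drop fields the author left empty.
    return {name: "\n".join(lines) for name, lines in fields.items() if lines}


def to_property(field_name):
    """Turn a field name like 'Tools Used' into 'toolsUsed'."""
    words = field_name.split()
    return words[0].lower() + "".join(w.capitalize() for w in words[1:])


def escape_literal(value):
    """Escape a string for use as an N-Triples literal."""
    return (value.replace("\\", "\\\\").replace('"', '\\"')
            .replace("\n", "\\n").replace("\r", "\\r"))


def to_ntriples(post_uri, class_name, fields):
    """Serialize one post as N-Triples: a type triple plus one per field."""
    lines = [f"<{post_uri}> <{RDF_TYPE}> <{SPE}{class_name}> ."]
    for name, value in fields.items():
        lines.append(
            f'<{post_uri}> <{SPE}{to_property(name)}> "{escape_literal(value)}" .'
        )
    return "\n".join(lines) + "\n"
```

Calling `to_ntriples(post_uri, "Experiment", parse_template(editor_text))` would give one triple stating the post's class and one literal-valued triple per filled-in field. A real implementation would take property names and value types from the ontology rather than deriving them from the field labels.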

4. Dynamic post map

web2x is both a publishing tool and a web site serving the published content. To make the web site friendlier to search engines, a dynamic post map feature is implemented. Whenever a crawler fetches the post map, the map is generated on the fly, so it always lists the most recent posts. Users only need to create the post map once, and can recreate it at any time.
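The idea of generating the map at request time can be sketched as below. The format of the post map is not described here, so this Python sketch simply assumes an HTML list of links, newest first. The function name `render_post_map` and the `title`, `url`, and `date` keys are hypothetical.

```python
from html import escape


def render_post_map(posts):
    """Build the post map from the posts as they exist at request time."""
    # ISO dates (YYYY-MM-DD) sort correctly as strings; newest first.
    ordered = sorted(posts, key=lambda p: p["date"], reverse=True)
    items = [
        f'<li><a href="{escape(p["url"], quote=True)}">{escape(p["title"])}</a></li>'
        for p in ordered
    ]
    return "<ul>\n" + "\n".join(items) + "\n</ul>\n"
```

Because nothing is cached, a post published between two crawler visits appears in the second response without any action from the user.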

[::Conclusion::]

[::Discussion::]

There is an existing structured blog package that is also based on WordPress. It supports microformats instead of an ontology for semantic publishing, so it is not appropriate for my purpose. It also rewrites a large number of WordPress core PHP files, which should make it very difficult to keep up with new versions of WordPress.

[::Main Concepts::]

[::Published In::]

[::References::]

[::Researcher::]

[::PI::]

[::Start Time::]

[::End Time::]

[::Status::]

[::Alternative Web Page::]

[::Rights::]
