ROME - ROME2 1st Proposal (June 10th 2006) NOT CURRENT

ROME2 1st Proposal (June 10th 2006) NOT CURRENT

It has been 2 years since ROME started and along the way we've fixed, improved, enhanced and changed the API and the implementation. We've had to address things we did not think at first, we have to add support for specifications that were not around when ROME started, doing that has not being a hard task. The best indicator we've done a fair job with ROME is adoption.

Some cracks are starting to appear, mistakes in the design (such as overloading the use of DCModule), a bit of implementation over-engineering (such as the home grown bean-introspector and the smart ObjectBean & co classes), a more complex than needed API (the existence of 2 abstraction levels for feed, Synd and Wire beans).

This proposal attempts to address the problems and limitations we currently have with ROME.

Backwards Compatibility, Support and Upgrade

ROME2 will change the API breaking backwards compatibility (after all we are an Open Source Project and that is what they do best).

We will maintain ROME 1.0 (bugfixing only) for 1 year to allow a smooth transition for all ROME users. We will also prepare migration/upgrade tips (based on our own experience) for the ROME community.

Leveraging New Language Features

ROME2 will make use of Generics to type collections in its beans. It will also use the enum construct when applicable.

Minimum Number of Dependencies

ROME2 will implement its core competency (Atom and RSS parsing, generation and manipulation), it will leverage other components as much as possible but it will focus, as ROME, in being as lean as possible not only in its code by in its dependencies (keeping them down to a reasonable minimum).

One Abstraction Level, 2 Models

ROME2 will not have an abstract representation of feeds (the Synd beans). It will have 2 distinct models, Atom and RSS, both of them first citizens.

ROME2 users will use the model that fits more their needs.

Conversion (automatic and implicit) between the 2 models will be part of ROME2.

No Interfaces, Pluggable Beans

Interfaces are good, in fact they are great. However, when using them with beans that have to be created by the developer it ads noise to the code. Interfaces are used for all parameters and variables but implementations must be used to create the bean instances thus hard-coding areas of the application to a specific implementation. Or a factory pattern has to be use throughout the code to hide the implementation, with a side effect of removing clarity from the code.

ROME2 will bring the best of both worlds, it will allow coding such as

Feed feed = new Feed();
Entry entry1 = new Entry();
Entry entry2 = new Entry();
feed.getEntries().add(entry1);
feed.getEntries().add(entry2);

While allowing pluggability of the beans implementation. The pluggability will be achieved using a combination of a factory pattern and a self-proxy patterns, both of them transparent to the ROME2 user.

For example the ROME2 API bean for the Atom feed would be something like:

public class Feed {
    private Feed feed;

    public Feed() {
        if (this.getClass() == Feed.class) {
            feed = BeanFactory.getFactory().create(Feed.class);
        }
    }

    public final Feed getImplementation() {
        return feed;
    }

    public Text getTitle() {
        return feed.getTitle();
    }

    public void setTitle(Text title) {
        feed.setTitle(title);
    }

...
}

The ROME2 (default/provided) implementation bean for the Atom feed would be something like:

public class FeedBean extends Feed {
    private Text title;

    public Text getTitle() {
        return title;
    }

    public void setTitle(Text title) {
        this.title = title;
    }

...
}

ROME2 users will use the ROME2 API bean as plain objects that they are. The ROME2 API beans will delegate all their properties to an instance implementing their corresponding class. This instance will be the implementation bean which is created by the BeanFactory.

To provide alternate implementation beans an alternate BeanFactory has to be provided.

To write an alternate implementation beans the ROME2 API bean has be used as an interface, all property methods have to be overridden, nothing else.

When extending ROME2 API beans, for example providing a new module bean, it is left to the implementor to follow this pattern or to use plain beans for the the additional module.

For cases (they shouldn't be many) that a ROME2 user needs to manipulate the implementation bean, all ROME2 API beans will give access to it via the getImplementation() method.

Using multiple Bean implementations simultaneously

ROME2 will support the use of multiple implementation beans simultaneously.

A use case could be retrieving a feed from a persistent store (which uses its own implementation beans), retrieving a feed from the internet (using the default implementation beans), merging their entries and producing an output feed, into the persistent store or out to the internet.

ROME2 public API will include the following classes for this purpose:

public abstract class BeanFactory {
    public abstract <T>  T create(Class<T> beanClass);
}

public class BeanFactoryInjector {
    public static void execute(BeanFactory factory, Runnable runnable) { .. }
}

When these two classes are not used explicitly ROME2 will use the default BeanFactory that creates default implementation beans.

For alternate implementation beans a BeanFactory has to be provided. This factory will be responsible for creating a complete ROME2 bean family.

The code to create ROME2 beans with an alternate factory will have to be written in the run() method of a Runnable object and it will have to be executed via the BeanFactoryInjector.execute() method.

The created ROME2 beans could be used outside of the Runnable.run() method, because of this ROME2 users have to be aware that it is possible to mix and match implementation beans and that may not always have a happy ending if not thought properly.

Collection Elements

All collection properties (such as the authors, categories, contributors, links and entries of an Atom feed bean) will only have getter methods in the ROME2 API beans, no setter methods. For example:

public class Feed {
    public List<Person> getAuthors() { ... }
    public List<Category> getCategories() { ... }
    ...
}

This will ensure that the implementation bean has control on the implementation of the collection being used. This is particularly important to enable alternate implementation beans to fetch data from a repository as the collection is iterated over.

Persistent experts we need your input here.

Modules

Feeds, both RSS and Atom, are extensible via namespaced elements at feed/channel and entry/item level, these namespace elements are known as modules. Beans supporting modules (feed, channel, entry and item) will all have the following method to support modules.

public Map<String,Module> getModules() { ... }

Because modules are uniquely identified by their URI, a Map will be used, the key will be the URI of the module and the value the module itself. As with other ROME2 API bean collection properties, the getModules() property has a getter only, the Map implementation is controlled by the implementation bean.

Module URIs in ROME2, as all the links URLs in the beans, will be Strings thus reducing object creation explosion a bit. For cases that some URI manipulation is required, the JDK URI class should be used.

Dynamic Modules Support

ROME2 modules design has to ensure it provides simple and comprehensive support for dynamic modules such as SLE and GData.

Unknown Modules

A special Module subclass, UnknownModule, will serve as placeholder for module data not explicitly processed by ROME2 available parsers/generators. This will allow to consume and re-export a feed without losing unknown (to ROME2) modules. Each unknown module will be keyed off with its URI in the map of the associated feed, channel, entry or item.

The data of the unknown module will be available as a String. ROME2 users should not use =UnknownModule='s data directly, if they need to manipulate that data they should find or implement bean/parser/generator for the module they need to manipulate.

Because unknown modules brings some extra overhead in the parsing and generating process ROME2 will have a property to enable/disable processing of unknown modules.

The xml:lang Attributes

All ROME2 API beans will have xml:lang attributes, String and enum and primitive types properties won't.

The xml:base Attributes

The xml:base attribute will not be present in any ROME2 API bean.

It will be the responsibility of the parsers to resolve any relative URL present in the feed at parsing time. Similarly, generators may relativize URLs as a size optimization (for God's sake we are doing XML).

Conversion between Feed Types

A FeedConverter class will provide conversion from Atom to RSS beans and vice versa, both at feed/channel and entry/item level. All properties, including modules data will be copied, after an conversion the state of the beans will be completely independent.

Object Class Methods in ROME2 Beans

The equals() and hashCode() methods will not be overridden. All ROME2 beans are mutable, it is not safe to use them as keys of hash structures.

Cloning will not be supported by ROME2 beans, instead they will have a copyFrom() method which is type safe and does a deep copy. Cloning a ROME2 bean will be a two step process, for example:

Feed feed1 = new Feed();
...
Feed feed2 = new Feed();
feed2.copyFrom(feed1);

The toString() method in the beans will print the property that most likely identifies the bean plus the class name of the implementation bean. For example for a Feed bean the output of toString() would be:

[http://foo.com/atom.xml - xxx.rome2.impl.pojo.atom.FeedBean]

--++ Plugins ClassLoading

Bean implementations, parsers, generators, converters and all pluggable ROME2 components will be loaded using the same classLoader ROME2 core classes are loaded from.

This is to keep simple the handling of them via singletons and their instantiation. Supporting different classLoaders would require complex handling for instantiation, to avoid missing classes and class mismatches.

As an example of how easy things can go sour, imagine ROME2 core classes in the common path of a web-container and 2 web-apps adding their own beans/parsers/generators/modules, the singletons managing them are in ROME2 core, managed by the common classLoader, a simple singleton won't cut it. ROME2 core will be small enough that it will not be a memory consumption issue if it is once in each web-app.

Parsers and Generators

XML parsing and generation will support stream mode. We may even consider using streaming as the underlaying default implementation even if manipulating a feed bean on its whole in memory. We have to see if how this could be done leveraging Abdera, else using StAX API directly. This also means we may get rid of the JDom dependency.

For example, the streaming version of an Atom parser and generator would be something like:

public interface AtomReader {

    // repeatable operation, returns was has been read from the header so far
    Feed readFeed() throws FeedException;

    // returns null when reaches the end of the feed
    Entry readEntry() throws FeedException;

    void close() throws FeedException;
  }

public interface AtomWriter {

    // if called must be called once and before a write(Entry)
    void write(Feed feed) throws FeedException;

    // if the first write is for an entry, then the output is an entry document
    abstract void write(Entry entry) throws FeedException;

    void close() throws FeedException;
  }

As with ROME, in ROME2 Parsers will be as lenient as possible. Generators will be strict.

Feed Validators

A bean FeedValidator class will verify that a feed or entry bean is valid for a given feed type. Feed generators would use this class to ensure feed correctness.

Feed Conversion

Conversion from Atom beans to RSS beans and vice versa will be done by a FeedConverter class that would have the following signature:

public class FeedConverter {
    public Feed convertToFeed(Channel channel, boolean processItems) { ... }
    public Entry convertToEntry(Item item) { ... }
    public Channel convertToChannel(Feed feed, boolean processEntries) { ... }
    public Item convertToItem(Entry entry) { ... }
}

Because the mapping from Atom to RSS elements is sometimes subject do discussion and different requirements the FeedConverter class will be pluggable. It will implement the self-proxy pattern ROME2 API beans implement.

Parser and Generator Filters

Manipulation of feeds during parsing and generation at feed/channel and entry/item level will be possible by implementing readers and writers wrappers that work on top of the original reader and writer instances.

This filtering/wrapping could be automated via configuration.

Sources

rome2proto.zip