Blog from October, 2011

In this post we will look at migrating from a flat class to a hierarchy of classes.

Given the following situation:

We need to map specific properties to specific subclasses, but the biggest challenge is telling NHibernate which subclass to instantiate on load. In the old situation, to differentiate between a RiverWeir and a SimpleWeir, one had to check the WeirType (string) property.

In the past, WeirType could contain the following strings:

  • simple_weir
  • river_weir
  • advanced_river_weir

We want NHibernate to use this property to decide which subclass to create, and we want to map both 'river_weir' and 'advanced_river_weir' to the RiverWeir class.

Let's first assume we don't have this 'advanced_river_weir'. In that case we can use NHibernate's normal discriminator mechanism (note: the discriminator must come after the id!). We also map the subclass-specific properties while we're at it:

Migrating HBM for just 'simple_weir' and 'river_weir'
  <class name="Weir">
    <id name="Id">
      <generator class="guid" />
    </id>

    <discriminator column="WeirType"/>

    <property name="Name" />
    <property name="CrestLevel" />
    <property name="GateHeight" />
    
    <subclass name="SimpleWeir" discriminator-value="simple_weir" >
      <property name="DischargeCoefficient" formula="SimpleWeirDischargeCoefficient" />
    </subclass>

    <subclass name="RiverWeir" discriminator-value="river_weir" >
      <property name="SubmergeReduction" formula="RiverWeirSubmergeReduction" />
    </subclass>    
  </class>  

Now let's look at the more complex situation with 'advanced_river_weir' included. Again, formula comes to the rescue, this time on the 'discriminator' element. This way you can influence what NHibernate sees as the discriminator value. The possible values the formula returns must match the discriminator-value attributes you supply for each subclass:

Migrating HBM 'simple_weir', 'river_weir' and 'advanced_river_weir'
  <class name="Weir">
    <id name="Id">
      <generator class="guid" />
    </id>

    <discriminator formula="case when WeirType in ('river_weir', 'advanced_river_weir') then 'RiverWeir' else 'SimpleWeir' end"/>

    <property name="Name" />
    <property name="CrestLevel" />
    <property name="GateHeight" />
    
    <subclass name="SimpleWeir" discriminator-value="SimpleWeir" >
      <property name="DischargeCoefficient" formula="SimpleWeirDischargeCoefficient" />
    </subclass>

    <subclass name="RiverWeir" discriminator-value="RiverWeir" >
      <property name="SubmergeReduction" formula="RiverWeirSubmergeReduction" />
    </subclass>    
  </class>  

Note that I have changed the discriminator-value of both subclasses, and also the values returned by the discriminator formula, to show that you can choose any value you like, as long as the two match.
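To make the effect of the discriminator formula concrete, here is a sketch (not actual captured output) of the cleaned-up SQL NHibernate would generate when loading a Weir, in the style of the cleaned-up queries shown later in this series:

Sketch of the generated SQL
  SELECT weir.Id,
     case when weir.WeirType in ('river_weir', 'advanced_river_weir')
          then 'RiverWeir' else 'SimpleWeir' end,
     weir.Name, weir.CrestLevel, weir.GateHeight, ...
  FROM Weir weir
  WHERE weir.Id=...

NHibernate evaluates the case expression per row and matches the result against the discriminator-value of each subclass to decide which class to instantiate.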

For source code: https://repos.deltares.nl/repos/delft-tools/trunk/shared/NHibernateBackwardsCompatibility

In this post I will look at what is probably a less common situation, but one with slightly more complex SQL formulas and a more entwined domain model.

Here is the situation we have:

We want to merge the classes Pump and PumpDefinition into one. The old database looks like this:

We should have no trouble maintaining the Pumps list in Network; nothing changed there. So we just need to retrieve the Pump properties DefinitionName and Capacity from another table: pump_definition.

Let's start with the basics of the migrating HBM, which is almost the same as the new HBM:

Migrating HBM
  <class name="Network">
    <id name="Id"> <generator class="guid" /> </id>
    <bag name="Pumps" cascade="all-delete-orphan">
      <key column="PumpId"/>
      <one-to-many class="Pump"/>
    </bag>
  </class>

  <class name="Pump">
    <id name="Id">
      <generator class="guid" />
    </id>
    <property name="Name" />
    <property name="DefinitionName" ???? />
    <property name="Capacity" ???? />
  </class>

It's not finished yet: we still need to do something about DefinitionName and Capacity. Let's start with DefinitionName. Using the formula attribute and some simple SQL we can retrieve it from the pump_definition table; we just need to match the Ids:

SELECT def.Name FROM pump_definition def WHERE def.Id = Definition

Note that 'Definition' here is the column (foreign key) in the old Pump table.

For Capacity we can do something quite similar, so the HBM needs to be adjusted to the following:

Actual legacy formulas
    <property name="DefinitionName" formula="( SELECT def.Name FROM pump_definition def WHERE def.Id = Definition)"/>
    <property name="Capacity" formula="( SELECT def.Capacity FROM pump_definition def WHERE def.Id = Definition)"/>

The resulting cleaned-up SQL looks like this:

SELECT pump.NetworkId, pump.Id, pump.Id, pump.Name, 
   ( SELECT def.Name FROM pump_definition def WHERE def.Id = pump.Definition), 
   ( SELECT def.Capacity FROM pump_definition def WHERE def.Id = pump.Definition) 
FROM Pump pump 
WHERE pump.NetworkId=...

This should have concluded the series on using hbm's for backward compatibility; however, I will do at least one more post, on class inheritance. For the source code, see the first post in the series.

This is part two in the series describing how to write a 'legacy'/'migrating' hbm for backward compatibility.

The refactoring in this post is the splitting of a class 'Bridge'. See the class diagram:

Originally there was one big Bridge class, but later it was decided to split it into a Bridge and a BridgeDefinition. Also the properties don't exactly match:

  • SideArea: no longer exists
  • Pillars -> renamed to NumPillars
  • Color: should always be 'RED'
  • IsOpen: new

Here are the old HBM (for reference) and the new HBM:

Old HBM
  <class name="Bridge">
    <id name="Id"> <generator class="guid" /> </id>
    <property name="Name" />
    <property name="Color" />
    <property name="HasRoad"/>
    <property name="Width"/>
    <property name="Height"/>
    <property name="SideArea"/>
    <property name="Type"/>
    <property name="Pillars"/>
  </class>
New HBM
  <class name="Bridge">
    <id name="Id"> <generator class="guid" /> </id>
    <property name="Name" />
    <property name="Color" />
    <property name="HasRoad"/>
    <property name="IsOpen"/>
    <many-to-one name="Definition" cascade="all-delete-orphan"/>
  </class>

  <class name="BridgeDefinition">
    <id name="Id">
      <generator class="guid" />
    </id>
    <property name="Type"/>
    <property name="Width"/>
    <property name="Height"/>
    <property name="NumPillars"/>
  </class>

We need to tackle four property changes and the class split itself. Let's start with the property changes:

  • SideArea: no longer exists. Solution: do nothing!
  • Pillars: renamed to NumPillars. Solution: <property name="NumPillars" formula="Pillars"/>
  • Color: should always be 'RED' when loading old data. Solution: <property name="Color" formula="'RED'"/>
  • IsOpen: new property, should always be true when loading old data. Solution: <property name="IsOpen" formula="1"/>

Finally we need to tackle the class split itself. Fortunately NHibernate has something called components, which persist two classes into one table. We just tell NHibernate to treat our table as such a table, with BridgeDefinition being a 'component' of Bridge, and NHibernate will load the single table into two entities, just like we want. The resulting HBM looks like this:

Migrating HBM
  <class name="Bridge">
    <id name="Id"> <generator class="guid" /> </id>
    <property name="Name" />
    <property name="Color" formula="'RED'"/>
    <property name="HasRoad"/>
    <property name="IsOpen" formula="1"/>

    <component name="Definition" class="BridgeDefinition">
      <property name="Type"/>
      <property name="Width"/>
      <property name="Height"/>
      <property name="NumPillars" formula="Pillars"/>
    </component>
  </class>

Note that the Id property of the BridgeDefinition component is not mapped. However when saving this class in the new session (with the new hbm's), it will receive an Id anyway.

The resulting SQL:

SELECT bridge.Id, bridge.Name, bridge.HasRoad, bridge.Type, bridge.Width, bridge.Height, 'RED', 1, bridge.Pillars FROM Bridge bridge WHERE bridge.Id=...

Note that the splitting into a component is handled by NHibernate in code and has no effect on the SQL.

Hopefully this post shows that what appears to be a more difficult refactoring turns out to be pretty easy to map. See the previous post for the SVN link to the source code. Next up: class merge.

In the upcoming posts I will give three examples of how to implement backward compatibility using hbm mappings.

In this post I start with a simple refactoring: a 'Product' class for which a boolean property was renamed and, as a bonus, also negated. See the class diagram for an overview:

So, originally a product was marked as 'Available', but it was later decided it would be more logical to use a 'Discontinued' flag. This is not just a simple rename! We need to do something equivalent to 'Discontinued = !Available' when loading products from the old database. Specifically, we need to write an NHibernate HBM mapping file which reads the old database format into the new object model.

The old database format is as follows:

And you can imagine the original HBM with which this was saved would look something like this:

Old HBM
<?xml version="1.0" encoding="utf-8" ?>
<hibernate-mapping xmlns="urn:nhibernate-mapping-2.2" assembly="..." namespace="...">
  <class name="Product">
    <id name="Id"> <generator class="guid" /> </id>
    <property name="Name" />
    <property name="Category" />
    <property name="Available"/>
  </class>
</hibernate-mapping>

The new HBM looks like this:

New HBM
<?xml version="1.0" encoding="utf-8" ?>
<hibernate-mapping xmlns="urn:nhibernate-mapping-2.2" assembly="..." namespace="...">
  <class name="Product">
    <id name="Id"> <generator class="guid" /> </id>
    <property name="Name" />
    <property name="Category" />
    <property name="Discontinued"/>
  </class>
</hibernate-mapping>

To migrate from the old database to the new object model, we must create another, special mapping. This mapping will only be used for loading, not for saving, so we can use some special features like embedding SQL queries into the HBM.

Let's start by writing the basics. We can map most properties 1-to-1, as you can see here:

Migrating HBM
<?xml version="1.0" encoding="utf-8" ?>
<hibernate-mapping xmlns="urn:nhibernate-mapping-2.2" assembly="..." namespace="...">
  <class name="Product">
    <id name="Id"> <generator class="guid" /> </id>
    <property name="Name" />
    <property name="Category" />
    <property name="Discontinued" ????what to do here???? />
  </class>
</hibernate-mapping>

Less trivial is how to retrieve the correct value for the 'Discontinued' property. We must keep in mind that we can only map to new properties, so we cannot have a property with name="Available" here. The 'Available' value is, however, still there in the database, so in this case we can solve it by using the 'formula' attribute of the property tag to insert SQL expressions and logic that get values from the database.

Specifically, we can do the following:

    <property name="Discontinued" formula="( Available = 0 )"/>

To understand why this works, it is important to understand how NHibernate processes these formulas: it inserts the formula as an expression (or subquery) into the larger SQL query. This is the cleaned-up version of what NHibernate generates:

SELECT product.Id, product.Name, product.Category, ( product.Available = 0 ) FROM Product product WHERE product.Id=...

Since 'Available' is a boolean value stored as an integer, we compare with '= 0' to invert the boolean value. Also notice that NHibernate inserts 'product.' in front of 'Available', since it recognizes it as a column in the database.

We could also have inserted an entire SQL query, for example:

    <property name="Discontinued" formula="SELECT p.Available = 0 FROM Product p WHERE p.Id = Id"/>

The resulting (cleaned-up) SQL is as you would expect:

SELECT product.Id, product.Name, product.Category, ( SELECT p.Available = 0 FROM Product p WHERE p.Id = product.Id ) FROM Product product WHERE product.Id=...

The result is equivalent (although quite a bit longer), but hopefully this shows how powerful subqueries can be. In the formula field you could also retrieve values from entirely different tables, combine columns, do simple math, or set default values.
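For example (these column names are hypothetical and not part of the Product table, and the exact functions available depend on your database dialect), formulas like the following would also work:

Hypothetical formula examples
    <property name="Price" formula="( BasePrice * (1 + VatRate) )"/>
    <property name="Category" formula="( coalesce(Category, 'UNKNOWN') )"/>

The first combines two columns with simple math; the second supplies a default value when the column is null.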

In upcoming posts I will show examples of backward compatibility for slightly more complex refactorings; class merging and class splitting. Most of that will be based on retrieving values with the formula field, but also using some other techniques. The source code for these three (and possibly more) testcases can be found on SVN: https://repos.deltares.nl/repos/delft-tools/trunk/shared/NHibernateBackwardsCompatibility

Backwards compatibility

This post describes a design for implementing backwards compatibility in DeltaShell.

Backwards compatibility: the ability to open, in the current version of the application, project files written by an older version.

Global idea / setup

The idea is to use custom hbm.xml mappings to convert older project databases to the current object model. For example, suppose a plugin has made a 1.0 release which should be supported. If the mappings change during further development, something has to be done to read the old projects: we need specific mappings for reading the old files next to the current mappings. The situation would be as follows:

So Person.hbm.1.0.xml describes the mapping of files written with 1.0 to the current version of Person. This mapping will change as Person evolves while we still want to support reading old project files. NHibernate has a rich feature set for getting the data from the old schema into the new objects, such as formulas, custom SQL, user types, etc. Another advantage of this approach is that backwards compatibility is solved using standard NHibernate technology and not DeltaShell-specific code.
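As a sketch (the file names follow the Person example above and are purely illustrative), a plugin's mapping files could then be organized like this:

Mapping files per supported version
  Mappings/
    Person.hbm.xml        current mapping, used for reading and writing current projects
    Person.hbm.1.0.xml    migrating mapping, reads 1.0 project files into the current Person class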

What happens when an old project gets opened?

If DeltaShell opens an old project, it creates a session for that version of the DB. After that, all objects are migrated to a new session with the current configuration. When the project is then saved, the database is in the new format.

How does DS know which mapping to use?

Each project database contains a table with the versions of the plugins with which the db was written. The versions are the same as those used for the plugin assemblies. This could look like this:

  Component    Release Version
  Framework    1.0.0
  Plugin1      0.6.0
  Plugin2      1.1.0

Now development continues and breaking changes occur. This could look like this:

When the project is saved, the following information is entered in the versions table:

  Component    Version
  Framework    1.2
  Plugin1      0.6.1
  Plugin2      1.1

Now when that project is loaded, the information is this:

  Component    Version
  Framework    1.3
  Plugin1      0.6.1
  Plugin2      1.2

Both the framework and Plugin2 have new versions. DeltaShell has to find the 1.2-to-1.3 mappings for the framework and the 1.1-to-1.2 mappings for Plugin2. Since not all entity mappings change, the logic is as follows.

Per mapping file
This section and the implemented logic has been updated June 2012
The rule per mapping file is simple:

  • Take the first mapping file with a version equal or more recent than the persisted version.

    Revision / build version numbers are ignored; 3.0.5 is treated as 3.0. The idea behind this is that file format changes should only occur in major or minor versions.
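A worked example with made-up version numbers: suppose a project was written with plugin version 1.1, and the plugin (currently at 1.3) ships the following mapping files for class A:

  A.hbm.1.0.xml    version 1.0 is older than the persisted 1.1, so it is skipped
  A.hbm.1.2.xml    version 1.2 is the first version equal to or more recent than 1.1, so it is used
  A.hbm.xml        the current mapping, used when no versioned mapping applies

A project written with version 1.1.7 would select the same file, since revision/build numbers are ignored and 1.1.7 is treated as 1.1.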

More elaborate, in a chart:

So how does this affect plugin development?

If you release a version which you want to support, you should include a test that reads a project file of that version. So when you release your plugin 1.5, you should write a test called ReadVersion1_5 in your plugin that reads a project covering all the mappings you have in your plugin. You should check that you get everything back from the project as you expect. The dsproj file is checked into svn, and the test should work since nothing has changed yet. All is well...

After a while you want to refactor your class A. You also change the mapping A.hbm.xml. But the test ReadVersion1_5 you wrote on release now fails, because reading the old db with the new A.hbm.xml does not work! What to do? You can choose not to refactor A and do something else, OR you can add an A.hbm.1.5.xml mapping that converts the old db to the new object. Once you check in that new mapping file, ReadVersion1_5 should use that specific file (instead of A.hbm.xml) and the test should pass again.

So for a plugin it boils down to 4 points:

1 Write an ignored test (not run on the build server) that creates a dsproj file with all the persistent objects you have in your plugin. You create a 'representative' project here. Update this test as your plugin evolves, and add the category [BackwardCompatibility].

2 If you make a release with file-format changes, run the test of 1) and copy the resulting project to the release version. Write a test that reads that project file, and add the category [BackwardCompatibility].

For example, if you release 1.1 you should run the test of 1) and copy the result to project1_1.dsproj + data. Check in this project and write a test ReadVersion1_1 that reads it. You will never change this checked-in project again.

3 If the test of 2) breaks, add version-specific mappings until the test passes again.

4 If you no longer support a version, delete the test and the mappings you created for it.

NHibernate delete logic

Delete logic

This is a work-in-progress description of the NHProjectRepository logic in DS.

NHProjectRepository uses a custom method to determine which objects can be deleted from the database. Unfortunately NHibernate cannot do this itself, because NHibernate cascades are evaluated very locally; see for example http://fabiomaulo.blogspot.com/2009/09/nhibernate-tree-re-parenting.html. Basically, NHibernate can delete object B if object A is deleted, but it cannot check whether object B is in use elsewhere.

The solution:

S - P = O 

SessionObjects minus ProjectObjects are Orphans. So everything that is in the session but not in the project can be deleted. Our save logic now looks like this:

1) Determine session objects using the NHibernate session's entity entries.

2) Determine project objects using the EntityWalker (more about this later).

3) Determine orphaned objects by subtracting 2) from 1), and delete them from the session.

4) Flush the session.
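A made-up example of this set arithmetic: suppose a pump was removed from the network since the last save, so it is still in the session but no longer reachable from the project root:

  S (session objects)  = { Project, Network, PumpA, PumpB }
  P (project objects)  = { Project, Network, PumpA }
  O (orphans) = S - P  = { PumpB }        deleted from the session before the flush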

EntityWalker and EntityMetaModel

EntityWalker is a class that, based on an EntityMetaModel, can traverse an object tree. The EntityMetaModel describes the relationships between objects and the strength of those relationships (composition vs aggregation). The EntityMetaModel can be determined by looking at NHibernate's configuration, and this is done by the NHibernateMetaModelProvider. Using this EMM, the EntityWalker can create the set of objects which are reachable from the project root. The EntityWalker uses only composite relationships to traverse the object tree, so anything that is reachable only through aggregate relationships is not considered part of the project (and will be considered an orphan if it is in the session).

NHibernateMetaModelProvider

This class is responsible for creating an EntityMetaModel based on an NHibernate configuration. It uses NHibernate's cascade styles to determine the strength of each relationship. The rule of thumb is that when a cascade includes a delete, the relationship is composition; otherwise it is a mere aggregate relationship. So:

  • All -> Composite
  • All-Delete-Orphan -> Composite
  • Others -> Aggregation

Customizing NHibernate deletes with DeltaShellDeleteEventListener

NHibernate still cascades on relationships, and this can be a problem, for example when an object is reparented. In the sample below, object A was part of (composition) D1, but it moved to D2. All relationships below are composite. Now, if the cascade on the mapping between D1 and A is 'all', NHibernate will delete A because D1 is deleted. This is obviously not what we want, because D2 now uses A; it would result in a 'deleted object would be re-saved by cascade' exception. To prevent this we need to intercept NHibernate's deletes and verify that the object really should be deleted. This can be done by checking whether the object that NHibernate wants to delete is an orphan. We do this by replacing NHibernate's DefaultDeleteEventListener with our own and overriding the OnDelete method. There, a check is done against the current orphans list: if the entity-to-delete is an orphan, the delete is processed further; otherwise it is cancelled. Note: using the IPreDeleteEventListener did not work, because cascades are handled first and these event handlers are fired very late in the pipeline. Hence the override on this class.

Further reading

Fowler on aggregation vs composition: http://martinfowler.com/bliki/AggregationAndComposition.html
NHibernate cascades: http://ayende.com/blog/1890/nhibernate-cascades-the-different-between-all-all-delete-orphans-and-save-update