NHibernate's concept of 'inverse' in relationships is probably the most often discussed and misunderstood mapping feature. When I was learning NHibernate, it took me some time to move from "I know where should I put 'inverse' and what then happens" to "I know why do I need 'inverse' here and there at all". Also now, whenever I'm trying to explain inverses to somebody, I find it pretty hard.
There are a lot of explainations over the net, but I'd like to have my own one. I don't think that the others are wrong, it'll just help me arrange my own understanding and if anyone else take advantage of this, that's great.
Where do we use inverse?
First, some widely-known facts, next we'll elaborate on few of them.
- Inverse is a boolean attribute that can be put on the collection mappings, regardless of collection's role (i.e. within one-to-many, many-to-many etc.), and on join mapping.
- We can't put inverse on other relation types, like many-to-one or one-to-one.
- By default, inverse is set to false.
- Inverse makes little sense for unidirectional relationships, it is to be used only for bidirectional ones.
- General recommendation is to use inverse="true" on exactly one side of each bidirectional relationship.
- When we don't set inverse, NHProf will complain about superfluous updates.
What does it mean for a collection to be 'inverse'?
The main problem in understanding 'inverse' is it's negating nature. We're not used to setting something up in order to NOT take an action. Inverse set to true means "I do NOT maintain this relationship". Hence, inverse set to false means "I DO maintain this relationship".
It'll be much more understandable if we could go to the opposite side of the relationship and be positive there: "This side maintains the relationship" and NHibernate would automatically know that the other side doesn't (*). But it is implemented as it is - we have to live with inverse's negative character.
Each relationship is represented in the database as an identifier of a related table row in the foreign key column at 'many' side. Why at 'many' side? Because that's how we do relationships in the relational databases. The column "holding" the association is always at 'many' side. It's not possible to keep the association at 'one' side because we'd have to insert many values into one database field somehow.
So what does it mean for a collection in NHibernate to maintain the relationship (inverse="false")? It means to ensure that the relation is correctly represented in the database. If the Comments collection in the Post object is responsible for maintaining the relationship, it has to make sure all its elements (comments) have foreign keys set to post's id. In order to do that, it issues a SQL UPDATE statement for each Comment, updating its Post reference. It works, the relationship is persisted correctly, but these updates often do not change anything and can be skipped (for performance reasons).
Inverse="true" on a collection means that it should not take care whether the foreign keys in the database are properly set. It just assumes that some other party will take care of it. What do we gain? We have no superfluous UPDATE statements. What can we lose? We have to be sure that the second side actually takes over the responsibility of maintaining the association. If it doesn't, nobody will and we'll be surprised that our relationship is not persisted at all (NHibernate will not throw an error or so, it won't guess that it's not what we've expected).
When should we set inverse="true"?
Let's consider one-to-many first. Our relationship must be bidirectional and have entities (not value types) at both sides for inverse to make sense. Other side ('many' side) is always active, we can't set inverse on many-to-one. This means that we should put inverse="true" on the collection, provided that:
- our collection is not explicitly ordered (like <list>) - it is i.e. <bag> or <set>; ordered lists have to be active in order to maintain the ordering correctly; 'many' side doesn't know anything about the ordering of collection at 'one' side
- we actually set the relationship at 'many' side correctly
Consider the example:
public class Post
{
public virtual int Id { get; set; }
public virtual ICollection<Comment> Comments { get; set; }
}
public class Comment
{
public virtual int Id { get; set; }
public virtual Post Post { get; set; }
public virtual string Text { get; set; }
}
// ...
var comment = new Comment() { Text = "the comment" };
session.Persist(comment);
post.Comments.Add(comment);
We are not setting Post property in Comment class as we may expect NHibernate will handle that as we append our comment to the collection of comments in particular Post object (**). If the post.Comments collection is not inverse, it will actually happen, but quite ineffectively:
We've inserted null reference first (exactly as it was in our code) and then, as the collection is responsible for maintaining the relationship (inverse="false"), the relationship was corrected by separate UPDATE statement. Moreover, in case we have not null constraint on Comment.Post_id (which is actually good), we'll end up with exception that we can't insert null foreign key value.
Let's see what happens with inverse="true":
There's no error, but the comment is actually not connected to the post, despite we've added it to a proper collection. But using inverse, we've explicitly turned off maintaining the relationship by that collection. And as we don't set the relationship onComment side, noone does.
The solution of course is to explicitly set comment's Post property. It is good from object model perspective, too, as it reduces the amount of magic in our code - what we've set is set, what we haven't set is not set magically.
var comment = new Comment() { Text = "the comment", Post = post };
session.Persist(comment);
post.Comments.Add(comment);
Much better now:
Time for many-to-many. Again, inverse makes sense only when we've mapped both sides. We have to choose one side which is active and mark the second one as inverse="true". Without that, when both collections are active, both try to insert a tuple to an intermediate table many-to-many needs. Having duplicated tuples makes no sense in most cases. For some suggestions how to choose which side is better in being active, see my post from December.
To sum up
Left side | Right side | Inverse? |
---|---|---|
one-to-many | not mapped | makes no sense - left side must be active |
one-to-many | many-to-one | right side should be active (left with inverse="true"), to save on UPDATEs (unless left side is explicitly ordered) |
many-to-many | not mapped | makes no sense - left side must be active |
many-to-many | many-to-many | one side should be active (inverse="false"), the other should not (inverse="true") |
______
(*) There are of course reasons why NHibernate doesn't do assumptions about other sides of relationships like that. The first one is to maintain independence between mappings - it will be cumbersome if change in mapping A modifies the B behaviour. The second one are ordered collections, like List. The ordering can be automatically kept by NHibernate only when collection side is active (inverse="false"). If the notion of being active is managed on the other side only, changing the collection type from non-ordered to ordered would require changes in both mappings.
(**) Note that inverse is completely independent from cascading. We can have cascade save on collection and it does not affect which side is responsible for managing the relationship. Cascade save means only that when persisting Post object, we're also persisting all Comments that were added to the collection. They are inserted with null Post value and UPDATEd later or inserted with proper value in single INSERT, depending on object state and inverse setting, as described above.