Hibernate Mappings for Performance and Serialization

What

An example of how to do Hibernate Mapping/JPA in a manner most conducive to well-performing database queries and minimally-painful serialization.

Why

Because I seem to have this discussion way too often and this stuff is too abstract to talk about without something concrete.

Because I’ve been down this road too many times and wanted to share those learnings generally.

Because I believe in domain-driven design and development and often people shoot themselves in the foot when mapping things.

Where

https://github.com/revelfire/mappingexample

How

Basic Mapping Goals

  • Minimize the object “graph” to owned-relationships
  • Prevent stack overflow errors on serialization
  • Prevent massive descent into the object graph on serialization
  • Model the data closely to how it gets accessed
  • Keep it simple

Guiding Principles

  • No, or minimal, bidirectional relationships
  • Minimize entity relationships, maximize use of foreign keys
  • Allow services to store relationships rather than automating with instance references
  • Allow HQL/Repository loads and services to load relationships WHEN they matter

So much of this stuff is use-case based and should be considered in the context of access vs ownership, both in services and over the wire. Please keep this in mind for the example; I will endeavor to explain the modeling choices in this light.

Ownership

OK, let’s talk about the example. (In case you didn’t read above: https://github.com/revelfire/mappingexample) We have here a very basic Account->User->Address set of domain/service/repository/test classes. It is one of the most common use cases you’re likely to find.

Notice that the Account object doesn’t actually contain a User object. User does contain an account_id.  Why?

It is generally the case that when we are interacting with a User, we don’t care about the Account (e.g. profile view/edit, access rights, password updates). On the other hand, when we interact with an Account, we sometimes want the list of Users that belong to it (admin screens).

@Entity
@Table(name="account")
public class Account extends Identifiable {

    @Column(length = 50, nullable = false)
    private String name;
...
}

@Entity
@Table(name="user")
public class User extends Identifiable {

    @Column(length = 100, nullable = false)
    private String name;
    // note that this is not a reference to Account
    @Column(name = "account_id", nullable = false)
    private Long accountId;

    /**
     * This COULD be @OneToOne as an "owned" relationship if we felt strongly
     * about having a separate table.
     *
     * This COULD be a one-many scenario in which case it would not be modeled on this
     * end, rather a repository.loadAddressForUser with address.user_id being the join point
     * via the foreign key reference.
     */
    @Embedded
    private Address address;
...
}

So – Account is an “owner” of users in the sense that without it, the users probably don’t make sense by themselves, and would go away if the account went away. So of course Account should have access to users, BUT it is one to many, so I really don’t want to cascade those changes, or even load the list of users in the general case – only when I really mean to (e.g. not when serializing). So I model it this way to create an avenue to access, but limit the ownership (delegating to the service tier).

Oftentimes I will hear, “But Chris, we want to be able to easily get the list of USERS from the ACCOUNT by simply typing account.users!” Yes. Maybe you do. You are trying to be lazy, but failing, because you are actually causing more work for yourself down the line. You will now most likely have to set up OpenSessionInViewFilter, and that sucks. Or worse, you make it EAGER and shoot yourself every freakin time you load it. (The ignorant developer’s way of solving LazyInitializationException.)

How about instead you create a nice repository method for that one-off use case?

@Repository
public interface AccountRepository extends CrudRepository<Account, Long> {
    // Explicit JPQL query with a positional parameter; users load only when a caller asks for them
    @Query("select u from User u where u.accountId = ?1")
    List<User> getUsersForAccount(long accountId);
}

That wasn’t so bad, was it? Access granted.
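Here is a minimal sketch of how the service tier might use such a query for the admin-screen case, written in plain Java with no Spring or Hibernate on the classpath. The names `UserLookup`, `AccountAdminService`, and `AccountAdminView` are hypothetical and do not come from the example repository; `UserLookup` stands in for the repository method above, and the records are stripped-down stand-ins for the entities.

```java
import java.util.List;

// Hypothetical stand-ins for the entities, reduced to the fields used here.
record Account(long id, String name) {}
record User(long id, String name, long accountId) {}

// Stand-in for AccountRepository.getUsersForAccount (an assumption for this sketch).
interface UserLookup {
    List<User> getUsersForAccount(long accountId);
}

// View object assembled only for the admin screen; Account itself stays flat.
record AccountAdminView(Account account, List<User> users) {}

class AccountAdminService {
    private final UserLookup users;

    AccountAdminService(UserLookup users) {
        this.users = users;
    }

    // The one code path that pays for the user query.
    AccountAdminView adminView(Account account) {
        return new AccountAdminView(account, users.getUsersForAccount(account.id()));
    }
}
```

Every other code path that touches an Account never triggers the user query, and serializing an Account never drags the user list along with it.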

Sometimes there is also concern about managing parent/child key relationships. To that, I make this case: You WANT to manage those. You don’t often set them, except on create, and you are already doing that, just as an entity rather than explicitly as an id. So you set an id. You’ll live longer.
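As a rough illustration of setting the parent key on create, here is a plain-Java sketch. `UserService`, its in-memory map, and the set of known account ids are all hypothetical stand-ins (a real service would delegate to repositories); the `User` class is a cut-down version of the entity shown earlier.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for the User entity: the parent is a plain key, not an object.
class User {
    Long id;
    final String name;
    final long accountId;

    User(String name, long accountId) {
        this.name = name;
        this.accountId = accountId;
    }
}

// Hypothetical service; the map stands in for a UserRepository.
class UserService {
    private final Set<Long> knownAccountIds;
    private final Map<Long, User> table = new HashMap<>();
    private long nextId = 1;

    UserService(Set<Long> knownAccountIds) {
        this.knownAccountIds = knownAccountIds;
    }

    // The parent key is set exactly once, here, when the user is created.
    User createUser(long accountId, String name) {
        if (!knownAccountIds.contains(accountId)) {
            throw new IllegalArgumentException("No such account: " + accountId);
        }
        User user = new User(name, accountId);
        user.id = nextId++;
        table.put(user.id, user);
        return user;
    }
}
```

The caller passes an id where it would otherwise have passed an Account instance, which is about the only extra work involved.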

Also on the topic of ownership, note that Address is @Embedded into User. You can read the comments there now because you skipped them before. Basically, I only ever turn on cascade with @OneToOne, and really the only reason to use that is to make your DBA happy or to prevent severely wide tables (which seems somewhat rare). This is truer ownership in the sense that loading one always loads the other.

What’s Wrong With Bidirectional?

Oftentimes in the documentation you will read things about mapping bidirectional relationships. While possible, I assert that this is the wrong thing to do in 99% of cases. It is very rare that you need to ask about User from Address, or about Account from User, and in the cases where you do, go ahead and load that entity by its id. Your code will be cleaner and you will have fewer errors.

Also, entity<->entity mapping and bidirectional mapping will often cause serialization to go bonkers and throw StackOverflowError. This is bad. Of course, you could @JsonIgnore that relationship or @JsonBackReference or whatever, but now you are most likely hacking in the solution. The (unnecessary) mapping itself is the problem.
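To make the failure concrete, here is a small plain-Java sketch. `NaiveJson` is a toy serializer written for this illustration (it is not Jackson and makes no claim about how any real library reports the problem); it simply follows every reference it finds. `LinkedAccount`, `LinkedUser`, and `FlatUser` are hypothetical classes, not the entities from the example repository.

```java
import java.util.ArrayList;
import java.util.List;

// Bidirectional pair: the account holds its users and each user holds its account.
class LinkedAccount {
    final String name;
    final List<LinkedUser> users = new ArrayList<>();
    LinkedAccount(String name) { this.name = name; }
}

class LinkedUser {
    final String name;
    final LinkedAccount account;
    LinkedUser(String name, LinkedAccount account) { this.name = name; this.account = account; }
}

// Id-based alternative: the user carries only the parent key.
class FlatUser {
    final String name;
    final long accountId;
    FlatUser(String name, long accountId) { this.name = name; this.accountId = accountId; }
}

// Toy serializer that descends into every reference it is given.
class NaiveJson {
    static String write(LinkedAccount a) {
        List<String> parts = new ArrayList<>();
        for (LinkedUser u : a.users) parts.add(write(u));
        return "{\"name\":\"" + a.name + "\",\"users\":[" + String.join(",", parts) + "]}";
    }

    static String write(LinkedUser u) {
        // Descends back into the account, which descends into its users, and so on.
        return "{\"name\":\"" + u.name + "\",\"account\":" + write(u.account) + "}";
    }

    static String write(FlatUser u) {
        return "{\"name\":\"" + u.name + "\",\"accountId\":" + u.accountId + "}";
    }
}
```

Calling `NaiveJson.write` on a `LinkedAccount` that has at least one user recurses until the stack is exhausted, while the `FlatUser` version has nothing to descend into.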

What About Many to Many?

In many to many there is almost always no ownership. Just access, and we have different methods for creating availability.

I don’t have this use case in this example (yet). I may add it. The simple answer is that with many to many you are probably still better off letting the JPA implementation manage the relationship. However, you should be careful to disable cascading behaviors on a mapping like this one:

@JsonIgnore
// No cascade attribute: the join table is still managed, but changes never cascade across the association
@ManyToMany(fetch = FetchType.LAZY)
// Assumes this getter lives on the Country entity, so joinColumns points at COUNTRY_ID
@JoinTable(
      name = "country_states",
      joinColumns = { @JoinColumn(name = "COUNTRY_ID", nullable = false, updatable = false) },
      inverseJoinColumns = { @JoinColumn(name = "STATE_ID", nullable = false, updatable = false) })
public Set<States> getStates() {
  return this.states;
}

I find I don’t run into this use case too often “in the wild” and typically wind up doing things in the services, or using (again) repository queries for these types of many-many loads. There is, however, much value in @ManyToMany for managing the join table when the case does arise.
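For the “doing things in the services” route, one possible shape is to treat each join-table row as a pair of keys and answer both directions with a query. This is only a sketch under that assumption: `CountryStateLink` and `CountryStateService` are hypothetical names, and the in-memory set stands in for a repository over the join table.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.stream.Collectors;

// One row of a hypothetical country_states join table: two keys, no entity references.
record CountryStateLink(long countryId, long stateId) {}

// Hypothetical service; the set stands in for a repository over the join table.
class CountryStateService {
    private final Set<CountryStateLink> links = new LinkedHashSet<>();

    void link(long countryId, long stateId) {
        links.add(new CountryStateLink(countryId, stateId));
    }

    // Plays the role of a repository query filtered by country id.
    Set<Long> stateIdsFor(long countryId) {
        return links.stream()
                .filter(l -> l.countryId() == countryId)
                .map(CountryStateLink::stateId)
                .collect(Collectors.toCollection(LinkedHashSet::new));
    }

    // Neither side owns the other, so the reverse lookup is just another query.
    Set<Long> countryIdsFor(long stateId) {
        return links.stream()
                .filter(l -> l.stateId() == stateId)
                .map(CountryStateLink::countryId)
                .collect(Collectors.toCollection(LinkedHashSet::new));
    }
}
```

Both lookups return ids only; the caller decides whether the full entities are worth loading.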

And Finally

If you take these approaches you will find your database performs better (far fewer automated joins, and access to data when you need it, not just because Hibernate is a dumb animal), your REST calls are smoother (serialization doesn’t wreak havoc), you don’t need OpenSessionInViewFilter (which slows things down and locks up connections in the pool), and ultimately you start to model your API endpoints a little differently. They really ought to be resource based and granular, unless performing some larger non-CRUD unit of work which, let’s face it, is service backed anyway (not automated via JPA). Right?