Preventing N+1 SELECT problem using Spring Data JPA EntityGraph

1. OVERVIEW

What’s the N+1 SELECT problem?

The N+1 SELECT problem happens when an ORM like Hibernate executes one SQL query to retrieve the main entity from a parent-child relationship and then one SQL query for each child object.

The more associated entities you have in your domain model, the more queries will be executed. The more results you get when retrieving a parent entity, the more queries will be executed. This will impact the performance of your application.

Prevent N+1 SELECT problem in Hibernate

This blog post helps you understand what the N+1 SELECT problem is and how to fix it for Spring Boot applications using Spring Data JPA Entity Graph.

Tip: You might want to take a look at the recent Code and Configuration Snippets / Recipes / Tips section, which helps you with common tasks without reading through a full blog post.

2. DATABASE TABLES

Let’s start with a partial ER diagram for the db_dvdrental relational database and its JPA mappings:

Entity Relationship Diagram - db_dvdrental DVD Rental Entity Relationship Diagram

film, actor, film_actor, film_category, and language are the relevant tables which I’ll cover in this post.

Tip: Read on to learn how to document your relational database using SchemaSpy.

3. JPA ENTITIES

Let’s now generate the JPA entities mapped to these tables using a Maven plugin:

Film.java:

@Entity
@Table(name = "film", schema = "public")
...
public class Film implements Serializable {

  @ManyToOne(fetch = FetchType.LAZY)
  @JoinColumn(name = "language_id", nullable = false)
  private Language language;

  @OneToMany(cascade = CascadeType.ALL, fetch = FetchType.LAZY, mappedBy = "film")
  private Set<FilmActor> filmActors;
...
}

FilmActor.java:

@Entity
@Table(name = "film_actor", schema = "public")
...
public class FilmActor implements Serializable {

  @ManyToOne(fetch = FetchType.LAZY)
  @JoinColumn(name = "film_id", nullable = false, insertable = false, updatable = false)
  private Film film;

  @ManyToOne(fetch = FetchType.LAZY)
  @JoinColumn(name = "actor_id", nullable = false, insertable = false, updatable = false)
  private Actor actor;
...
}

Actor.java:

@Entity
@Table(name = "actor", schema = "public")
...
public class Actor implements Serializable {

  @OneToMany(cascade = CascadeType.ALL, fetch = FetchType.LAZY, mappedBy = "actor")
  private Set<FilmActor> filmActors;
...
}

Language.java:

@Entity
@Table(name = "language", schema = "public")
...
public class Language implements Serializable {

  @OneToMany(cascade = CascadeType.ALL, fetch = FetchType.LAZY, mappedBy = "language")
  private Set<Film> films;
...
}

Notice the @OneToMany, @ManyToOne entity associations are LAZY loaded.

Important: EAGER fetching associations are simple to use but this strategy might load irrelevant data to your business use case. You’ll likely end up including many JOIN clauses in a SELECT statement impacting your application performance.

4. SPRING DATA JPA REPOSITORIES - N+1 SELECT PROBLEM

Let’s start with the simplest repository implementation using Spring Data JPA to perform CRUD operations for Film JPA entities:

FilmDao.java:

@Repository
public interface FilmDao extends JpaRepository<Film, Integer> {
}

And that is it. For now.

To reproduce the N+1 SELECT problem using the accompanying source code, let’s first understand the execution flow:

Retrieve films Sequence Diagram

The methods relevant to this post are:

FilmDao.findAll(page): Retrieves films from the RDBMS.
FilmResourceMapper.map(films): Maps from the internal domain to the API contracted response.

Sending a GET request to /api/films outputs these logs:

Note: You can enable Hibernate logs setting these properties (yaml-formatted):
logging:
  level:
    org.hibernate.type.descriptor.sql.BasicBinder: TRACE

spring:
  jpa:
    show-sql: true
    properties:
      hibernate:
        format_sql: true

Hibernate: 
    select
        film0_.film_id as film_id1_8_,
        film0_.description as descript2_8_,
        film0_.fulltext as fulltext3_8_,
        film0_.language_id as languag13_8_,
        film0_.last_update as last_upd4_8_,
        film0_.length as length5_8_,
        film0_.rating as rating6_8_,
        film0_.release_year as release_7_8_,
        film0_.rental_duration as rental_d8_8_,
        film0_.rental_rate as rental_r9_8_,
        film0_.replacement_cost as replace10_8_,
        film0_.special_features as special11_8_,
        film0_.title as title12_8_ 
    from
        public.film film0_ limit ?
Hibernate: 
    select
        count(film0_.film_id) as col_0_0_ 
    from
        public.film film0_
Hibernate: 
    select
        filmactors0_.film_id as film_id2_9_0_,
        filmactors0_.actor_id as actor_id1_9_0_,
        filmactors0_.actor_id as actor_id1_9_1_,
        filmactors0_.film_id as film_id2_9_1_,
        filmactors0_.last_update as last_upd3_9_1_ 
    from
        public.film_actor filmactors0_ 
    where
        filmactors0_.film_id=?
2020-10-27 13:40:24.589 TRACE 25609 --- [nio-8080-exec-1] o.h.type.descriptor.sql.BasicBinder      : binding parameter [1] as [INTEGER] - [1]
Hibernate: 
    select
        actor0_.actor_id as actor_id1_0_0_,
        actor0_.first_name as first_na2_0_0_,
        actor0_.last_name as last_nam3_0_0_,
        actor0_.last_update as last_upd4_0_0_ 
    from
        public.actor actor0_ 
    where
        actor0_.actor_id=?
2020-10-27 13:40:24.599 TRACE 25609 --- [nio-8080-exec-1] o.h.type.descriptor.sql.BasicBinder      : binding parameter [1] as [INTEGER] - [198]
Hibernate: 
    select
        actor0_.actor_id as actor_id1_0_0_,
        actor0_.first_name as first_na2_0_0_,
        actor0_.last_name as last_nam3_0_0_,
        actor0_.last_update as last_upd4_0_0_ 
    from
        public.actor actor0_ 
    where
        actor0_.actor_id=?
2020-10-27 13:40:24.600 TRACE 25609 --- [nio-8080-exec-1] o.h.type.descriptor.sql.BasicBinder      : binding parameter [1] as [INTEGER] - [188]
....
2020-10-27 13:40:24.604 TRACE 25609 --- [nio-8080-exec-1] o.h.type.descriptor.sql.BasicBinder      : binding parameter [1] as [INTEGER] - [30]
Hibernate: 
    select
        language0_.language_id as language1_13_0_,
        language0_.last_update as last_upd2_13_0_,
        language0_.name as name3_13_0_ 
    from
        public.language language0_ 
    where
        language0_.language_id=?
2020-10-27 13:40:24.605 TRACE 25609 --- [nio-8080-exec-1] o.h.type.descriptor.sql.BasicBinder      : binding parameter [1] as [INTEGER] - [1]
Hibernate: 
    select
        filmactors0_.film_id as film_id2_9_0_,
        filmactors0_.actor_id as actor_id1_9_0_,
        filmactors0_.actor_id as actor_id1_9_1_,
        filmactors0_.film_id as film_id2_9_1_,
        filmactors0_.last_update as last_upd3_9_1_ 
    from
        public.film_actor filmactors0_ 
    where
        filmactors0_.film_id=?
2020-10-27 13:40:24.606 TRACE 25609 --- [nio-8080-exec-1] o.h.type.descriptor.sql.BasicBinder      : binding parameter [1] as [INTEGER] - [2]
Hibernate: 
    select
        actor0_.actor_id as actor_id1_0_0_,
        actor0_.first_name as first_na2_0_0_,
        actor0_.last_name as last_nam3_0_0_,
        actor0_.last_update as last_upd4_0_0_ 
    from
        public.actor actor0_ 
    where
        actor0_.actor_id=?
2020-10-27 13:40:24.608 TRACE 25609 --- [nio-8080-exec-1] o.h.type.descriptor.sql.BasicBinder      : binding parameter [1] as [INTEGER] - [85]
...

Hibernate first executes a SELECT statement to retrieve all films.
For each parent entity Film, Hibernate executes another SELECT statement to retrieve all its FilmActor associated entities (@OneToMany).
And for each FilmActor entity retrieved, Hibernate executes another SELECT statement to retrieve its Actor child entity (@ManyToOne).
But there is more, for each Film, Hibernates also executes yet another SELECT statement to retrieve the Film’s Language (@ManyToOne).

To illustrate the N+1 SELECT problem with an example:

If you retrieve two films, one with five actors, in English and the second film with eight actors, in Spanish, Hibernate ends up executing:

One SELECT to retrieve both films.
Two SELECT to retrieve film actors associations, one for each film.
Five SELECT to retrieve actors for the first film.
Eight SELECT to retrieve actors for the second film.
One SELECT to retrieve English language for the first film.
One last SELECT to retrieve Spanish language associated to the second film.

These add up to eighteen SELECT statements. A very inefficient approach to retrieve associated JPA entities. A lot of round trips to the RDBMS. You’ll very-likely have performance problems once your traffic starts to grow.

5. HIBERNATE’S LazyInitializationException

Before showing how to avoid the N+1 SELECT problem using @EntityGraph, let’s ask a question.

Where were these SELECT statements executed?

The first SELECT from the film table statement is executed via the FilmDao.findAll(page) method. The other SELECT statements are executed when the REST controller converts the Film entities to REST resources using FilmResourceMapper.

Warning:
Don’t expose your internal domain as your external interface. It might lead to API compatibility issues.
The application’s internal domain might evolve at a different pace than the contracted API responses.
Map or transform the internal domain and send the result as the API response.

But the database transaction is committed after the execution of DvdRentalService.retrieveFilms() as a result of its @Transactional annotation, how does the mapping even work?

It works as a consequence of Spring Boot’s behavior of setting spring.jpa.open-in-view=true by default. This approach keeps the Hibernate Session open after a database transaction has been committed.

Important: The Open Session in View or Open Entity Manager in View is considered an anti-pattern.

Once you set spring.jpa.open-in-view=false, which is the recommended practice, sending the same GET request to /api/films fails.

...
2020-10-23 18:12:44.693 ERROR 20180 --- [nio-8080-exec-1] o.a.c.c.C.[.[.[/].[dispatcherServlet]    : Servlet.service() for servlet [dispatcherServlet] in context with path [] threw exception [Request processing failed; nested exception is org.hibernate.LazyInitializationException: failed to lazily initialize a collection of role: com.asimio.demo.domain.Film.filmActors, could not initialize proxy - no Session] with root cause

org.hibernate.LazyInitializationException: failed to lazily initialize a collection of role: com.asimio.demo.domain.Film.filmActors, could not initialize proxy - no Session
...

The Entity Manager (wraps the Hibernate Session) is closed after executing DvdRentalService.retrieveFilms(). The JPA entities are in a DETACHED state and navigating through the filmActors or any other child entity throws LazyInitializationException.

How do you avoid the N+1 SELECT problem whitout using the Open Entity Manager in View pattern and preventing the LazyInitializationException?

You would have to eagerly-fetch the associated entities of the parent JPA entity. A few ways to do it are:

SQL JOIN clauses if you are using native queries.
FETCH JOIN clauses if you are using JPQL or Criteria API.
Spring Data JPA’s @EntityGraph if you are implementing your repositories using Spring Data JPA. The next section helps you with this option.

6. SPRING DATA JPA REPOSITORIES - @EntityGraph

@EntityGraph, added in JPA 2.1 allows you to define which relations or associated entities to load for your specific use cases.

Let’s now update the Spring Data JPA repository:

FilmDao.java:

@Repository
public interface FilmDao extends JpaRepository<Film, Integer> {

  @EntityGraph(
    type = EntityGraphType.FETCH,
    attributePaths = {
      "language",
      "filmActors",
      "filmActors.actor"
    }
  )
  Page<Film> findAll(Pageable pageable);
...
}

When retrieving Films using FilmDao.findAll(page), the @EntityGraph annotation causes Spring Data JPA and Hibernate to fetch the associated entities included in attributePaths according to the EntityGraphType specified, FETCH, or LOAD.

Type	Description
EntityGraphType.FETCH	The attributes included in attributePaths are treated as FetchType.EAGER. Other attributes found in the JPA entities that are not included in attributePaths are treated FetchType.LAZY.
EntityGraphType.LOAD	The attributes included in attributePaths are treated as FetchType.EAGER. Other attributes found in the JPA entities that are not included in attributePaths are treated according to their specified or default FetchType.

Notice attributesPath can also include the nested attributes of a JPA entity collection. filmActors is a collection of FilmActor. And @EntityGraph includes the actor associated entity via filmActors.actor.

This time sending a GET request to /api/films logs:

Hibernate: 
  select
    film0_.film_id as film_id1_8_0_,
    language1_.language_id as language1_13_1_,
    filmactors2_.actor_id as actor_id1_9_2_,
    filmactors2_.film_id as film_id2_9_2_,
    actor3_.actor_id as actor_id1_0_3_,
    film0_.description as descript2_8_0_,
    film0_.fulltext as fulltext3_8_0_,
    film0_.language_id as languag13_8_0_,
    film0_.last_update as last_upd4_8_0_,
    film0_.length as length5_8_0_,
    film0_.rating as rating6_8_0_,
    film0_.release_year as release_7_8_0_,
    film0_.rental_duration as rental_d8_8_0_,
    film0_.rental_rate as rental_r9_8_0_,
    film0_.replacement_cost as replace10_8_0_,
    film0_.special_features as special11_8_0_,
    film0_.title as title12_8_0_,
    language1_.last_update as last_upd2_13_1_,
    language1_.name as name3_13_1_,
    filmactors2_.last_update as last_upd3_9_2_,
    filmactors2_.film_id as film_id2_9_0__,
    filmactors2_.actor_id as actor_id1_9_0__,
    actor3_.first_name as first_na2_0_3_,
    actor3_.last_name as last_nam3_0_3_,
    actor3_.last_update as last_upd4_0_3_ 
  from
    public.film film0_ 
  left outer join
    public.language language1_ 
      on film0_.language_id=language1_.language_id 
  left outer join
    public.film_actor filmactors2_ 
      on film0_.film_id=filmactors2_.film_id 
  left outer join
    public.actor actor3_ 
      on filmactors2_.actor_id=actor3_.actor_id

A single SELECT statement to retrieve Films and its associated JPA entities via JOIN clauses.

7. CONCLUSION

The N+1 SELECT issue will impact the performance of your Java application.

If you are already using Spring Data JPA to interact with a relational database, you can take advantage of its support for JPA’s @EntityGraph.

Spring Data JPA’s own @EntityGraph is easy to add to your existing repositories. It helps to execute only one SELECT statement preventing the N+1 SELECT problem.

8. UPCOMING

Lets take a closer look at the Hibernate logs this request generates:

WARN 84147 --- [nio-8080-exec-1] o.h.h.internal.ast.QueryTranslatorImpl   : HHH000104: firstResult/maxResults specified with collection fetch; applying in memory!

HHH000104: firstResult/maxResults specified with collection fetch; applying in memory! might be a huge problem depending on the size of your dataset. Stay tuned, I’ll cover how to fix Hibernate HHH000104 firstResult/maxResults warning using Spring Data JPA.

Thanks for reading and as always, feedback is very much appreciated. If you found this post helpful and would like to receive updates when content like this gets published, sign up to the newsletter.