@BatchSize a smart or stupid use?
- Yes,
@BatchSize
is meant to be used with lazy associations. - Hibernate will execute multiple statements in most sitations anyway, even if the count of uninitialized proxies/collections is less than the specified batch size. See this answer for more details. Also, more lighter queries compared to less bigger ones may positively contribute to the overall throughput of the system.
-
@BatchSize
on class level means that the specified batch size for the entity will be applied for all@*ToOne
lazy associations with that entity. See the example with thePerson
entity in the documentation.
The linked question/answers you provided are more concerned about the need for optimization and lazy loading in general. They apply here as well of course, but they are not related to batch loading only, which is just one of the possible approaches.
Another important thing relates to eager loading which is mentioned in the linked answers and which suggests that if a property is always used then you may get better performance by using eager loading. This is in general not true for collections and in many situations for to-one associations either.
For example, suppose you have the following entity for which bs
and cs
are always used when A
is used.
public class A {
@OneToMany
private Collection<B> bs;
@OneToMany
private Collection<C> cs;
}
Eagerly loading bs
and cs
obviously suffers from N+1 selects problem if you don't join them in a single query. But if you join them in a single query, for example like:
select a from A
left join fetch a.bs
left join fetch a.cs
then you create full Cartesian product between bs
and cs
and returning count(a.bs) x count(a.cs)
rows in the result set for each a
which are read one by one and assembled into the entities of A
and their collections of bs
and cs
.
Batch fetching would be very optimal in this situation, because you would first read A
s, then bs
and then cs
, resulting in more queries but with much less total amount of data that is transferred from the database. Also, the separate queries are much simpler than a big one with joins and are easier for database to execute and optimize.
Walfrat
Updated on July 20, 2022Comments
-
Walfrat almost 2 years
First I'll explain how I understood and use
@BatchSize
:@BatchSize
is made in order to load relations of objects in batch, making less SQL request to the database. This is specially usefull on LAZY@OneToMany
relations.However it's even useful on LAZY
@OneToOne
relation and@ManyToOne
: if you load a list of entities from the database and ask to load a lazyed@*ToOne
entity, it will load the entities by batch even if i just use a test that load the relation of the 1st entity of the list.Note if some want to tests : This only show if the entities are not already loaded : for instance if you have a list of user with manager and list all users, when you will access to the manager, no request will be triggered since it's already loaded.
The only drawback that i see on that method is if you load a list of item from the database but only use a part of it. This is a post-filtering operation.
So let's get to the main point.
Let's assume that i make everything good to never do post-filtering-like operations even if it's makes me do native SQL queries or use DTO objects for multiselect criteria query and so on.
- Am I right to consider that I can just
@BatchSize
every lazyed relations after having carefully think about using eager loading / join and finally choose a lazy relation ? - Do i have any interest to search for an adequate value for the
@BatchSize
or can i think "the bigger the better" ? This would mean "is there any a limit of number in "IN" SQL operator that can make my request enough slower to not be worth anymore ? I use Postgres but if you have answers for others SGBD i'm interested too. - Optional question : it seems that using
@BatchSize
on a class isn't producing a lot of results. I still have to annotate every lazy relationships, did i miss something about it or is it useless ?
EDIT : The point of my 3 is that i'm getting a different behaviour.
Let say i'm loading a list of entities of class "A" which has a LAZY OneToMany relationship to B. Now i want to print all creationDate of B. So i'm doing a classic 2 for loop.
I annotated B with BatchSize now :
- @OneToMany is not annotated with BatchSize : each set of B are loaded on each iteration independently without batching. So my annotation on B class seems to be totally ignored. Even if i set a value to "two" and i have 6 entries in one set, i have one query for that set.
- @OneToMany is annotated : i have the specific query of batches that are loaded. If i fix the batch size to two and i have a total of 10 B accros i just get 5 requests : whatever the number of A i have. If i set it to 100 : i have 1 query for B objects.
PS : i'm not considering any related query to B that might fire to load B fields with fetch select/subselect.
EDIT 2 : i just found this post Why would I not use @BatchSize on every lazy loaded relationship? althought i googled and search on SO beforeposting my question, guess i didn't use the right words...
However i'm adding something different that might lead to a different answer : when i'm wondering about using BatchSize on every relations, it's after choosing if i want a eager loading, with join / select fetch or if i want lazy loading.
- Am I right to consider that I can just
-
Walfrat over 8 years1 - I think i read once that it wasn't a so good practice, that's why i wondered that as long i'm aware of the post-filtering like problems i can just spam it everywhere. 3 - i'm getting a different behaviour, i edited my question for more information. Cah you check this please ?
-
Dragan Bozanovic over 8 years@Walfrat Please see my edited answer. 3)
@BatchSize
on class level applies only for to-one associations with those entities. 1) What do you mean by post-filtering problems? -
Walfrat over 8 years1- BatchSize load a bunch of entities, but if your SQL/JPA/HQL,... query return a list of items from which you process in order to extract a part of it (post-filtering), you will have load in memory items that are not necessery.,Concernning your edit, it's something that i'm already aware this is why as I said that the discussion about batch-size come after choosing lazy-loading over eager. 3- Thanks for the precision, i did miss something about it so :P. But in the link you provided i don't see it clearly explained.
-
Dragan Bozanovic over 8 years@Walfrat You're welcome. Sure, there is no silver bullet that is applicable in each situation, there are always cases that may require custom handling.
-
Ruslan Stelmachenko almost 5 yearsAfter reading what
@BatchSize
on class level means, I still not understand, what@*ToOne
associations it affects. For example if@BatchSize
is placed onPerson
class, it will affect all@*ToOne
associations on fields insidePerson
class or all@*ToOne
associations on fields havingPerson
type inside any other class? After explore examples, the answer is: second.