So I wondered whether there is a threshold below which using the lean() method is not that profitable, or whether it is a real silver bullet for read-only scenarios such as reporting.
Solution - Performance Tests
To figure that out, I decided to run performance tests for both the find() and lean() methods and then compare how the results behave.
Here are the conditions for the experiment:
- Performance tests are run for different numbers of objects to return (from 100 to 10000).
- The data model for the returned objects contains 6 fields (not heavy objects).
- Performance tests are run with a 5-second break in between, so that they do not affect each other.
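The conditions above can be sketched as a small timing harness. This is only a minimal illustration and not the actual test code from the repo: the names `timeQuery`, `compare` and `queries` are made up for this example, and the Mongoose model `Item` mentioned in the comment is hypothetical.

```javascript
const { performance } = require('perf_hooks');

// Resolve after `ms` milliseconds; used as the pause between runs.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Time one async query function and return the elapsed milliseconds.
async function timeQuery(runQuery) {
  const start = performance.now();
  await runQuery();
  return performance.now() - start;
}

// For each result size, time the find() variant and the lean() variant,
// pausing between runs so that they do not affect each other.
async function compare(sizes, queries, pauseMs = 5000) {
  const rows = [];
  for (const size of sizes) {
    const findMs = await timeQuery(() => queries.find(size));
    await sleep(pauseMs);
    const leanMs = await timeQuery(() => queries.lean(size));
    await sleep(pauseMs);
    rows.push({ size, findMs, leanMs });
  }
  return rows;
}

// With Mongoose, `queries` could look like this (Item is a hypothetical model):
// const queries = {
//   find: (n) => Item.find().limit(n).exec(),
//   lean: (n) => Item.find().limit(n).lean().exec(),
// };
```

Calling `compare([100, 1000, 10000], queries)` would then return one row per result size with both timings, which is the kind of data a chart like the ones in this post can be built from.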
Please take a look at the resulting chart. X-axis represents the time required to complete the query. Y-axis represents the number of objects to return by the query.
Orange data - find() method. Blue data - lean() method
I also made a chart showing how the following KPI changes: time for find() / time for lean(). Here it is:
Time for find() / Time for lean()
And here is the summary of the research:
- The time required to get data with find() and with lean() grows linearly with the number of objects to return.
- The function representing the find() method grows faster than the one for the lean() method.
- The ratio time for find() / time for lean() is constant: it doesn’t really depend on the number of objects to return. This means that the absolute time saved grows with the result size, so for small numbers the benefit of lean() over find() can be hard to notice, but for big ones it is very clear.
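To see why a constant ratio still means a growing benefit, here is a tiny sketch that derives both numbers from timing rows. The costs used below (4 time units per object for find(), 1 for lean()) are made up purely for illustration and are not the measured results; `summarize` is a hypothetical helper name.

```javascript
// Derive the ratio and the absolute saving from timing rows.
function summarize(rows) {
  return rows.map(({ size, findMs, leanMs }) => ({
    size,
    ratio: findMs / leanMs,   // find() time relative to lean() time
    savedMs: findMs - leanMs, // absolute time saved by using lean()
  }));
}

// Made-up linear costs for illustration only (NOT the measured results):
// find() costs 4 time units per object, lean() costs 1.
const sizes = [100, 1000, 10000];
const rows = sizes.map((size) => ({ size, findMs: 4 * size, leanMs: 1 * size }));

console.log(summarize(rows));
```

With these illustrative inputs the ratio is 4 for every size, while the saved time goes from 300 to 3000 to 30000 units: the same constant factor, but a very different absolute gain.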
Going back to the question behind this research, whether the lean() method is really a silver bullet for optimizing read-only scenarios: well, it actually is.
It dramatically improves performance, which is especially clear on bigger numbers.
For those who want to play around with my research, you’re welcome to take a look at the repo.