[MAGNOLIA-8760] Usage of PostgreSQL to publish heavy data loads increases toast tables size too much Created: 13/Feb/23 Updated: 04/Jul/23 Resolved: 08/Mar/23 |
|
| Status: | Closed |
| Project: | Magnolia |
| Component/s: | None |
| Affects Version/s: | 6.2.28 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Critical |
| Reporter: | Roberto Gaona | Assignee: | Daniel Alonso |
| Resolution: | Done | Votes: | 0 |
| Labels: | cs-bk | ||
| Σ Remaining Estimate: | Not Specified | Remaining Estimate: | Not Specified |
| Σ Time Spent: | Not Specified | Time Spent: | Not Specified |
| Σ Original Estimate: | Not Specified | Original Estimate: | Not Specified |
| Attachments: |
|
|||||||||||||||||||||||||
| Issue Links: |
|
|||||||||||||||||||||||||
| Sub-Tasks: |
|
|||||||||||||||||||||||||
| Template: | ||||||||||||||||||||||||||
| Acceptance criteria: |
Empty
|
|||||||||||||||||||||||||
| Task DoD: |
[X]*
Doc/release notes changes? Comment present?
[X]*
Downstream builds green?
[X]*
Solution information and context easily available?
[X]*
Tests
[X]*
FixVersion filled and not yet released
[ ] 
Architecture Decision Record (ADR)
|
|||||||||||||||||||||||||
| Bug DoR: |
[X]*
Steps to reproduce, expected, and actual results filled
[X]*
Affected version filled
|
|||||||||||||||||||||||||
| Release notes required: |
Yes
|
|||||||||||||||||||||||||
| Documentation update required: |
Yes
|
|||||||||||||||||||||||||
| Date of First Response: | ||||||||||||||||||||||||||
| Visible to: |
Christian Menzel, Fabian Mangold, Jan Haderka
|
|||||||||||||||||||||||||
| Epic Link: | Performance problems with too many child nodes | |||||||||||||||||||||||||
| Sprint: | Nucleus 31 | |||||||||||||||||||||||||
| Story Points: | 5 | |||||||||||||||||||||||||
| Team: | ||||||||||||||||||||||||||
| Work Started: | ||||||||||||||||||||||||||
| Description |
|
It has been reported that using PostgreSQL and trying to publish any high amount of data/number of sites or pages will make the toast table increase in size up to occupying the whole available DB. This issue has been reported on PostgreSQL 11.9 when the user tried to publish 8000 sites and ended up filling 15GB of DB available size when trying to publish 120MB of data. You can find the used configuration parameters attached to this ticket. There are some workarounds that can be used, like performing an export/import of the data for a first heavy load publication instead of performing a normal publish, executing a vacuum of the toast table after the publish or directly increasing the DB size so it's not filled during the publication. |
| Comments |
| Comment by Jan Haderka [ 21/Feb/23 ] |
|
One possible clue to the problem might be that it wasn't happening roughly about 12months ago (+/-) so perhaps we can narrow down the issue to what has changed in that time. Something in the product or drivers or db or infra used to run it? Could it be adverse effect of some 3rd party lib or some seemingly innocuous code change? The effect is made more pronounced when working with either flat structures or high amounts of data. What is so special about it except the validation overhead and amount of memory consumed by the data and traffic generated by retrieving it from DB? When checking the internet for underlying causes of DB failing to reclaim space itself, it often comes down to long living transactions or other live long living connections to DB that keep claim on parts of the temp space hence preventing db from reclaiming block of garbage from memory as it would reclaim only space between last used block and end of the temp heap. What could such connection be? While first detected on db_datastore used as backing storage for binaries, it can be seen on other tables too (as mentioned above - those with big data or a lot of flat data). Is there anything else those have in common? ... just some food for thought. |
| Comment by David Martin [ 04/Jul/23 ] |
|
Hi Folks, This bug was closed, what was it's resolution? Was it fixed, is there a workround? Cheers, Dave |