AWS database single column adds extremely much data









up vote
1
down vote

favorite












I'm retrieving data from an AWS database using PgAdmin. This works well. The problem is that I have one column that I set to True after I retrieve the corresponding row, where originally it is set to Null. Doing so adds an enormous amount of data to my database.



I have checked that this is not due to other processes: it only happens when my program is running.
I am certain no rows are being added, I have checked the number of rows before and after and they're the same.



Furthermore, it only does this when changing specific tables, when I update other tables in the same database with the same process, the database size stays the same. It also does not always increase the database size, only once every couple changes does the total size increase.



How can changing a single boolean from Null to True add 0.1 MB to my database?



I'm using the following commands to check my database makeup:



To get table sizes



SELECT
relname as Table,
pg_total_relation_size(relid) As Size,
pg_size_pretty(pg_total_relation_size(relid) - pg_relation_size(relid)) as External Size
FROM pg_catalog.pg_statio_user_tables ORDER BY pg_total_relation_size(relid) DESC;


To get number of rows:



SELECT schemaname,relname,n_live_tup 
FROM pg_stat_user_tables
ORDER BY n_live_tup DESC;


To get database size:



SELECT pg_database_size('mydatabasename')









share|improve this question

















This question has an open bounty worth +150
reputation from Nathan ending ending at 2018-11-18 15:32:14Z">in 3 days.


This question has not received enough attention.


I would like to know how to fix the problem of my database increasing in size. It would also be nice to understand how come this happened.















  • Null is no data. True is a byte (8 bits). Plus one column, How many rows?
    – Alexander Ingham
    Nov 9 at 15:37










  • @AlexanderIngham I already added the column, now I'm just changing the values in the column from Null to True, one by one. It's about 20 million rows, but after retrieving 100.000 of those I already have over a GB of extra data.
    – Nathan
    Nov 9 at 15:41















up vote
1
down vote

favorite












I'm retrieving data from an AWS database using PgAdmin. This works well. The problem is that I have one column that I set to True after I retrieve the corresponding row, where originally it is set to Null. Doing so adds an enormous amount of data to my database.



I have checked that this is not due to other processes: it only happens when my program is running.
I am certain no rows are being added, I have checked the number of rows before and after and they're the same.



Furthermore, it only does this when changing specific tables, when I update other tables in the same database with the same process, the database size stays the same. It also does not always increase the database size, only once every couple changes does the total size increase.



How can changing a single boolean from Null to True add 0.1 MB to my database?



I'm using the following commands to check my database makeup:



To get table sizes



SELECT
relname as Table,
pg_total_relation_size(relid) As Size,
pg_size_pretty(pg_total_relation_size(relid) - pg_relation_size(relid)) as External Size
FROM pg_catalog.pg_statio_user_tables ORDER BY pg_total_relation_size(relid) DESC;


To get number of rows:



SELECT schemaname,relname,n_live_tup 
FROM pg_stat_user_tables
ORDER BY n_live_tup DESC;


To get database size:



SELECT pg_database_size('mydatabasename')









share|improve this question

















This question has an open bounty worth +150
reputation from Nathan ending ending at 2018-11-18 15:32:14Z">in 3 days.


This question has not received enough attention.


I would like to know how to fix the problem of my database increasing in size. It would also be nice to understand how come this happened.















  • Null is no data. True is a byte (8 bits). Plus one column, How many rows?
    – Alexander Ingham
    Nov 9 at 15:37










  • @AlexanderIngham I already added the column, now I'm just changing the values in the column from Null to True, one by one. It's about 20 million rows, but after retrieving 100.000 of those I already have over a GB of extra data.
    – Nathan
    Nov 9 at 15:41













up vote
1
down vote

favorite









up vote
1
down vote

favorite











I'm retrieving data from an AWS database using PgAdmin. This works well. The problem is that I have one column that I set to True after I retrieve the corresponding row, where originally it is set to Null. Doing so adds an enormous amount of data to my database.



I have checked that this is not due to other processes: it only happens when my program is running.
I am certain no rows are being added, I have checked the number of rows before and after and they're the same.



Furthermore, it only does this when changing specific tables, when I update other tables in the same database with the same process, the database size stays the same. It also does not always increase the database size, only once every couple changes does the total size increase.



How can changing a single boolean from Null to True add 0.1 MB to my database?



I'm using the following commands to check my database makeup:



To get table sizes



SELECT
relname as Table,
pg_total_relation_size(relid) As Size,
pg_size_pretty(pg_total_relation_size(relid) - pg_relation_size(relid)) as External Size
FROM pg_catalog.pg_statio_user_tables ORDER BY pg_total_relation_size(relid) DESC;


To get number of rows:



SELECT schemaname,relname,n_live_tup 
FROM pg_stat_user_tables
ORDER BY n_live_tup DESC;


To get database size:



SELECT pg_database_size('mydatabasename')









share|improve this question















I'm retrieving data from an AWS database using PgAdmin. This works well. The problem is that I have one column that I set to True after I retrieve the corresponding row, where originally it is set to Null. Doing so adds an enormous amount of data to my database.



I have checked that this is not due to other processes: it only happens when my program is running.
I am certain no rows are being added, I have checked the number of rows before and after and they're the same.



Furthermore, it only does this when changing specific tables, when I update other tables in the same database with the same process, the database size stays the same. It also does not always increase the database size, only once every couple changes does the total size increase.



How can changing a single boolean from Null to True add 0.1 MB to my database?



I'm using the following commands to check my database makeup:



To get table sizes



SELECT
relname as Table,
pg_total_relation_size(relid) As Size,
pg_size_pretty(pg_total_relation_size(relid) - pg_relation_size(relid)) as External Size
FROM pg_catalog.pg_statio_user_tables ORDER BY pg_total_relation_size(relid) DESC;


To get number of rows:



SELECT schemaname,relname,n_live_tup 
FROM pg_stat_user_tables
ORDER BY n_live_tup DESC;


To get database size:



SELECT pg_database_size('mydatabasename')






amazon-web-services pgadmin






share|improve this question















share|improve this question













share|improve this question




share|improve this question








edited Nov 11 at 16:48

























asked Nov 9 at 15:26









Nathan

1,115517




1,115517






This question has an open bounty worth +150
reputation from Nathan ending ending at 2018-11-18 15:32:14Z">in 3 days.


This question has not received enough attention.


I would like to know how to fix the problem of my database increasing in size. It would also be nice to understand how come this happened.








This question has an open bounty worth +150
reputation from Nathan ending ending at 2018-11-18 15:32:14Z">in 3 days.


This question has not received enough attention.


I would like to know how to fix the problem of my database increasing in size. It would also be nice to understand how come this happened.













  • Null is no data. True is a byte (8 bits). Plus one column, How many rows?
    – Alexander Ingham
    Nov 9 at 15:37










  • @AlexanderIngham I already added the column, now I'm just changing the values in the column from Null to True, one by one. It's about 20 million rows, but after retrieving 100.000 of those I already have over a GB of extra data.
    – Nathan
    Nov 9 at 15:41

















  • Null is no data. True is a byte (8 bits). Plus one column, How many rows?
    – Alexander Ingham
    Nov 9 at 15:37










  • @AlexanderIngham I already added the column, now I'm just changing the values in the column from Null to True, one by one. It's about 20 million rows, but after retrieving 100.000 of those I already have over a GB of extra data.
    – Nathan
    Nov 9 at 15:41
















Null is no data. True is a byte (8 bits). Plus one column, How many rows?
– Alexander Ingham
Nov 9 at 15:37




Null is no data. True is a byte (8 bits). Plus one column, How many rows?
– Alexander Ingham
Nov 9 at 15:37












@AlexanderIngham I already added the column, now I'm just changing the values in the column from Null to True, one by one. It's about 20 million rows, but after retrieving 100.000 of those I already have over a GB of extra data.
– Nathan
Nov 9 at 15:41





@AlexanderIngham I already added the column, now I'm just changing the values in the column from Null to True, one by one. It's about 20 million rows, but after retrieving 100.000 of those I already have over a GB of extra data.
– Nathan
Nov 9 at 15:41













2 Answers
2






active

oldest

votes

















up vote
0
down vote













What do you mean adds data? to all the data files? specifically to some files?



to get a precise answer you should supply more details, but generally speaking, any DB operation will add data to the transaction logs, and possibly other files.






share|improve this answer




















  • Hi Ophir, what I mean is that the total database size increases. Strangely enough this only happens when I change specific tables. I understand it always adds some data, but it currently adds such an exorbitant amount of space that something has to be wrong.
    – Nathan
    Nov 11 at 16:43

















up vote
0
down vote













The problem is that As described here




"In normal PostgreSQL operation, tuples that are deleted or obsoleted by an update are not physically removed from their table"




Furthermore, we did not always close the cursor which also increased database size while running.



The current solution is to do a full vacuum every once in a while.



The problem still persists less strongly






share|improve this answer






















    Your Answer






    StackExchange.ifUsing("editor", function ()
    StackExchange.using("externalEditor", function ()
    StackExchange.using("snippets", function ()
    StackExchange.snippets.init();
    );
    );
    , "code-snippets");

    StackExchange.ready(function()
    var channelOptions =
    tags: "".split(" "),
    id: "1"
    ;
    initTagRenderer("".split(" "), "".split(" "), channelOptions);

    StackExchange.using("externalEditor", function()
    // Have to fire editor after snippets, if snippets enabled
    if (StackExchange.settings.snippets.snippetsEnabled)
    StackExchange.using("snippets", function()
    createEditor();
    );

    else
    createEditor();

    );

    function createEditor()
    StackExchange.prepareEditor(
    heartbeatType: 'answer',
    convertImagesToLinks: true,
    noModals: true,
    showLowRepImageUploadWarning: true,
    reputationToPostImages: 10,
    bindNavPrevention: true,
    postfix: "",
    imageUploader:
    brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
    contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
    allowUrls: true
    ,
    onDemand: true,
    discardSelector: ".discard-answer"
    ,immediatelyShowMarkdownHelp:true
    );



    );













     

    draft saved


    draft discarded


















    StackExchange.ready(
    function ()
    StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53228612%2faws-database-single-column-adds-extremely-much-data%23new-answer', 'question_page');

    );

    Post as a guest















    Required, but never shown

























    2 Answers
    2






    active

    oldest

    votes








    2 Answers
    2






    active

    oldest

    votes









    active

    oldest

    votes






    active

    oldest

    votes








    up vote
    0
    down vote













    What do you mean adds data? to all the data files? specifically to some files?



    to get a precise answer you should supply more details, but generally speaking, any DB operation will add data to the transaction logs, and possibly other files.






    share|improve this answer




















    • Hi Ophir, what I mean is that the total database size increases. Strangely enough this only happens when I change specific tables. I understand it always adds some data, but it currently adds such an exorbitant amount of space that something has to be wrong.
      – Nathan
      Nov 11 at 16:43














    up vote
    0
    down vote













    What do you mean adds data? to all the data files? specifically to some files?



    to get a precise answer you should supply more details, but generally speaking, any DB operation will add data to the transaction logs, and possibly other files.






    share|improve this answer




















    • Hi Ophir, what I mean is that the total database size increases. Strangely enough this only happens when I change specific tables. I understand it always adds some data, but it currently adds such an exorbitant amount of space that something has to be wrong.
      – Nathan
      Nov 11 at 16:43












    up vote
    0
    down vote










    up vote
    0
    down vote









    What do you mean adds data? to all the data files? specifically to some files?



    to get a precise answer you should supply more details, but generally speaking, any DB operation will add data to the transaction logs, and possibly other files.






    share|improve this answer












    What do you mean adds data? to all the data files? specifically to some files?



    to get a precise answer you should supply more details, but generally speaking, any DB operation will add data to the transaction logs, and possibly other files.







    share|improve this answer












    share|improve this answer



    share|improve this answer










    answered Nov 11 at 15:48









    Ophir Yoktan

    4,26122970




    4,26122970











    • Hi Ophir, what I mean is that the total database size increases. Strangely enough this only happens when I change specific tables. I understand it always adds some data, but it currently adds such an exorbitant amount of space that something has to be wrong.
      – Nathan
      Nov 11 at 16:43
















    • Hi Ophir, what I mean is that the total database size increases. Strangely enough this only happens when I change specific tables. I understand it always adds some data, but it currently adds such an exorbitant amount of space that something has to be wrong.
      – Nathan
      Nov 11 at 16:43















    Hi Ophir, what I mean is that the total database size increases. Strangely enough this only happens when I change specific tables. I understand it always adds some data, but it currently adds such an exorbitant amount of space that something has to be wrong.
    – Nathan
    Nov 11 at 16:43




    Hi Ophir, what I mean is that the total database size increases. Strangely enough this only happens when I change specific tables. I understand it always adds some data, but it currently adds such an exorbitant amount of space that something has to be wrong.
    – Nathan
    Nov 11 at 16:43












    up vote
    0
    down vote













    The problem is that As described here




    "In normal PostgreSQL operation, tuples that are deleted or obsoleted by an update are not physically removed from their table"




    Furthermore, we did not always close the cursor which also increased database size while running.



    The current solution is to do a full vacuum every once in a while.



    The problem still persists less strongly






    share|improve this answer


























      up vote
      0
      down vote













      The problem is that As described here




      "In normal PostgreSQL operation, tuples that are deleted or obsoleted by an update are not physically removed from their table"




      Furthermore, we did not always close the cursor which also increased database size while running.



      The current solution is to do a full vacuum every once in a while.



      The problem still persists less strongly






      share|improve this answer
























        up vote
        0
        down vote










        up vote
        0
        down vote









        The problem is that As described here




        "In normal PostgreSQL operation, tuples that are deleted or obsoleted by an update are not physically removed from their table"




        Furthermore, we did not always close the cursor which also increased database size while running.



        The current solution is to do a full vacuum every once in a while.



        The problem still persists less strongly






        share|improve this answer














        The problem is that As described here




        "In normal PostgreSQL operation, tuples that are deleted or obsoleted by an update are not physically removed from their table"




        Furthermore, we did not always close the cursor which also increased database size while running.



        The current solution is to do a full vacuum every once in a while.



        The problem still persists less strongly







        share|improve this answer














        share|improve this answer



        share|improve this answer








        edited 6 hours ago

























        answered 13 hours ago









        Nathan

        1,115517




        1,115517



























             

            draft saved


            draft discarded















































             


            draft saved


            draft discarded














            StackExchange.ready(
            function ()
            StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53228612%2faws-database-single-column-adds-extremely-much-data%23new-answer', 'question_page');

            );

            Post as a guest















            Required, but never shown





















































            Required, but never shown














            Required, but never shown












            Required, but never shown







            Required, but never shown

































            Required, but never shown














            Required, but never shown












            Required, but never shown







            Required, but never shown







            Popular posts from this blog

            Kleinkühnau

            Makov (Slowakei)

            Deutsches Schauspielhaus