With the constant stream of updates to popular and important apps and websites, how do developers ensure that an update preserves the users' and the site's existing database, considering that an update might contain data-corrupting bugs or flaws that users can exploit?

This question is not about the frequent database leaks, since those only involve the release of read-only data, and the exploits take the form of accessing user accounts and data rather than altering the site's database at large.

Even if there is a backup of the database, rolling back changes that negatively affect millions of people would create a ton of uproar, especially if the site is built around real-time interactions, like stock-broker apps, social media apps and instant messengers. Yet I have personally never heard of such incidents.

Do companies at such a large scale have extreme QA in place, depend on user feedback and reporting, or have they just been lucky that they haven't been exploited yet? Or am I completely wrong and these incidents do occur?

Keep in mind that I am an amateur in this domain. I have worked with MySQL databases for educational purposes and personal projects, but I found databases to be very fragile, e.g. the ability to nuke an entire database with just 2-3 words (see the example below). This is what prompted the question.
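
For instance (just to illustrate "2-3 words"; the database name is a made-up placeholder), a single short statement is enough:

    DROP DATABASE shop_production;  -- irreversibly removes the schema and all of its data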

  • Well, first off, bugs do happen. Some companies never recover from a mishandled update/migration; others pay huge fines or compensation.

    But very big companies have several ways to mitigate such issues. In random order:

    Isolation: the database cannot be easily reached, especially from the outside, and especially not with the privileges required to do great harm. That's a combination of roles (both real-world ones like "developer" and logical ones like Unix users), firewalls, and running the service and the database on separate servers. It also means separate databases for different services or customers, so you can, for example, start the upgrade with non-paying customers only (the privilege side is sketched below).
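
    To illustrate the privilege part: the account the application uses day to day can be restricted to the statements it actually needs, so even a buggy or compromised release cannot drop tables. A minimal sketch, assuming MySQL; the user, host and schema names are placeholders:

        -- application account: data access only
        CREATE USER 'app_rw'@'10.0.0.%' IDENTIFIED BY 'change-me';
        GRANT SELECT, INSERT, UPDATE, DELETE ON shop.* TO 'app_rw'@'10.0.0.%';
        -- deliberately no DROP, ALTER or GRANT OPTION; schema changes go
        -- through a separate, tightly controlled account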

    Replication: a full copy of the database is usually kept for anything requiring high availability, and all changes are synchronized between the copies. But during an upgrade, only one of the two is upgraded at a time (see the sketch below).
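
    As a rough sketch of the synchronization side, this is roughly how a MySQL read replica can be pointed at its primary (MySQL 8.0.23+ syntax, assuming GTID-based replication is enabled; host and credentials are placeholders):

        -- run on the replica: point it at the primary, then start applying its changes
        CHANGE REPLICATION SOURCE TO
            SOURCE_HOST = 'db-primary.internal',
            SOURCE_USER = 'repl',
            SOURCE_PASSWORD = 'change-me',
            SOURCE_AUTO_POSITION = 1;
        START REPLICA;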

    Backup: frequent full backups to a remote location (a different data center).

    Test environment: a full copy of production (functionality-wise) is usually available, sometimes to both developers and customers. Most changes are validated there before being pushed to production; that includes custom SQL queries for maintenance, but also any kind of upgrade, including hardware (a defensive pattern for such queries is sketched below).
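
    When such a maintenance query does reach production, a common defensive habit is to wrap it in a transaction and check the effect before committing. A hedged sketch with made-up table and column names (this works for data changes; DDL in MySQL would implicitly commit):

        START TRANSACTION;
        UPDATE orders SET status = 'archived' WHERE created_at < '2015-01-01';
        SELECT ROW_COUNT();  -- sanity check: does the affected row count match expectations?
        -- COMMIT;   -- only if the count looks right
        -- ROLLBACK; -- otherwise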

    Insurance: a way to stay in business when shit hits the fan, and one of the reasons Oracle Database sees so much use. Fees caused by a bug can be handed over to the DB vendor. This comes at a huge recurring cost, though, and with plenty of fine print and asterisks.

    Recovery rehearsal: basically a way to test all the above (except maybe the insurance).