With the constant stream of updates to popular and important apps and websites, how do developers ensure that an update preserves the users’ and the site’s existing database, given that an update might contain data-corrupting bugs or exploits?

This question is not about the frequent database leaks, since those only involve the release of read-only data, and the exploits involved take the form of accessing user accounts and data, not altering the site’s database at large.

Even if there is a backup of the database, undoing changes that negatively affect millions of people would create a lot of uproar, especially if the site is based on real-time interactions, like stock-broker apps, social media apps, and instant messengers. However, I have personally never heard of such incidents.

Do companies at such large scales have extremely rigorous QA in place, depend on user feedback and reporting, or have they just been lucky that they haven’t been exploited yet? Or am I completely wrong, and these incidents do occur?

Keep in mind that I am an amateur in this domain. I have worked with MySQL databases for educational purposes and personal projects, but I found databases to be very fragile, given, for example, the ability to nuke an entire database with just 2-3 words. That is what prompted this question.
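
For instance, in MySQL a single short statement is enough to delete a whole database and everything in it (the database name below is just a made-up example):

    DROP DATABASE myproject;  -- removes the schema and all of its data in one statement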

  • I would also say that if you have not heard of exploits, you have not been listening. Ransomware has been in the news all the time. All security is porous, and shit happens too. This is why a lot of companies have gone nuts about IT security and other processes. There is a good reason.

    I worked for a company that was migrating an enterprise system. They did the rollout by region, but at times regions could not ship product for a month or more. I saw it from the outside, but one has to wonder who thought that was acceptable or reasonable.

    Bottom line in IT: if it can happen, it will. If it has not been tested, assume it does not work, and so on. Assume the worst, because there are so many more ways for things to go wrong than to work correctly.

  • A combination of:

    1. dev/test environments
    2. Database backups
    3. Data warehouses/lakes
    4. Not writing bad code (parameterized queries every single time; see the sketch after this list)
    5. Knowing how to write database upgrade scripts
    6. At larger companies, having a DBA team that keeps things running right
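
    As a minimal illustration of point 4, here is what a parameterized (prepared) query looks like in plain MySQL. The table and column names are made up; the point is that the user-supplied value is bound separately and never concatenated into the SQL text:

        -- The statement text only ever contains a placeholder, not user input.
        PREPARE find_user FROM 'SELECT id, name FROM users WHERE email = ?';
        SET @email = 'alice@example.com';   -- user-supplied value, bound separately
        EXECUTE find_user USING @email;
        DEALLOCATE PREPARE find_user;
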
  • (Developer here)

    With hope, unmet deadlines, and false promises.

    Seriously, though: observing good practices, and testing.

    For example, your scenario of nuking a database with a couple of words (“DROP SCHEMA…”) can easily be prevented by access restrictions. The DB user that the application uses to access the database simply has no permission to invoke such a command. You can, for example, only allow access through stored procedures and views, or allow no direct DB access at all, only access through (also limited) APIs exposed by the underlying system.
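
    A minimal sketch of what that restriction can look like in MySQL (the user, schema, procedure, and view names are made up): the application account is only granted EXECUTE on approved procedures and SELECT on a view, so destructive statements issued through it simply fail with a permission error.

        CREATE USER 'app_svc'@'%' IDENTIFIED BY 'change-me';            -- placeholder password
        GRANT EXECUTE ON PROCEDURE shop.place_order TO 'app_svc'@'%';   -- may call this procedure
        GRANT SELECT ON shop.order_summary TO 'app_svc'@'%';            -- may read this view
        -- No DROP, DELETE, ALTER, etc. are granted, so a statement like
        -- DROP SCHEMA shop from this account is rejected by the server.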

    So you’ll have a set of procedures and/or views, or higher-level APIs (where, for example, you don’t directly INSERT a purchase order into an ERP database table, but craft a JSON or XML document containing the order information and send it via an API to the ERP system, which validates it and, hopefully, ensures integrity).

    Layers of protection, restrictions, validations, etc.

    The API exposed by the ERP, or the procedure you made available to the backend, simply can’t do destructive things.

    Most changes will be at the app/UI layer, and when you need to change the “lower” layers, you’re dealing with units that are much easier to verify and test.

    Think about a waiter you interact with in a restaurant. You order, get your food, and pay through them, but would you trust them with all the food-safety protocols from farming/fishing, handling, storing, and cooking, all the way to your table? Could they personally withdraw your money from a physical vault at your bank and carry it to the restaurant’s bank vault? No. Each layer in this highly complex transaction becomes more specialized, restricted, robust, regulated, and secured. The user interface may be disposable, replaceable, and inconsistent, but it has little access to what is critical.

    PS: of course even then, sometimes shifts happen.

  • I can’t speak for every company, but what I do is have a separate development database and web server. I make my changes and test them in development. If something breaks, no problem: I revert the changes and no users are impacted. When everything looks ready to deploy, you move it over to live.

    Of course, there are also QA and security testing steps in between development and live to ensure that your new code, while not showing visible bugs, isn’t opening security holes.

    As for the ability to nuke the entire database in 2-3 words, pretty much every developer has a story of the time they accidentally wiped out data. In my case, I was working on an expense report application that I had written and was trying to delete some data. I wrote a query along the lines of “Delete From ExpenseReports where ID = 7.” Then I highlighted the query (since I had many queries in that SQL tab) and hit F5 to run it.

    Unfortunately, the second my finger hit the key, I realized that only “Delete From ExpenseReports” was highlighted. I had deleted ALL of the data. I spent the rest of the day recovering it and learned a valuable lesson about being too quick to run queries.
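
    One habit that makes this kind of slip recoverable (assuming a transactional engine such as InnoDB) is wrapping ad-hoc deletes in an explicit transaction and checking the affected row count before committing:

        START TRANSACTION;
        DELETE FROM ExpenseReports WHERE ID = 7;
        SELECT ROW_COUNT();   -- expect 1; a surprising number means the WHERE clause was wrong
        -- ROLLBACK;          -- if the count is wrong, roll back and nothing is lost
        COMMIT;               -- only commit once the result looks right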

  • Well, first off, bugs do happen. Some companies do not recover from a mishandled update/migration; others pay huge fines or compensation.

    But very big companies have several ways to mitigate such issues. In random order:

    Isolation: the database cannot be easily reached, especially from the outside, and especially not with the privileges required to do great harm. That’s a combination of roles (both real-world ones like developers and logical ones like Unix users), firewalls, and different servers running the service and the DB. There are also different databases for different services or customers, so you can, for example, start the upgrade with non-paying customers.

    Replication: a full copy of the database is usually kept for anything requiring high availability, and all changes are synchronized. During an upgrade, only one of the two copies is upgraded at a time.
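
    As a rough sketch (MySQL 8 syntax; the hostname and credentials are placeholders, and GTID-based replication is assumed), keeping that second copy in sync is configured on the replica with something like:

        CHANGE REPLICATION SOURCE TO
            SOURCE_HOST = 'db-primary.internal',
            SOURCE_USER = 'repl',
            SOURCE_PASSWORD = 'change-me',
            SOURCE_AUTO_POSITION = 1;   -- follow the primary using GTIDs
        START REPLICA;
        SHOW REPLICA STATUS;            -- confirm the replica is connected and not lagging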

    Backup: frequent full backups to a remote location (a different data center).

    Test environment: a full copy of production (functionality-wise) is usually available, sometimes to both developers and customers. Most changes are validated there before being pushed to production (this includes custom SQL queries for maintenance, but also any kind of upgrade, including hardware).

    Insurance: a way to stay in business when shit hits the fan, and one of the reasons OracleDB sees so much use. Any costs caused by a bug can be handed over to the DB vendor. This comes at a huge recurring cost, though, and with a lot of fine print and asterisks.

    Recovery rehearsal: basically a way to test all the above (except maybe the insurance).

  • It depends. A typical software upgrade doesn’t touch the database at all; the database doesn’t change if you don’t change it. Some updates will run database migrations to handle schema changes, and there are specific technologies that aid in this process, such as Liquibase. Updates that are especially volatile will sometimes involve taking a backup of the database just in case, but any sane company will have nightly backups, if not better, with a couple of fallback options in case of issues. Super-critical systems will have logical replication, hidden nodes, or time travel (depending on the technology) for more real-time recovery.
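
    For example, a Liquibase migration can be written as a “formatted SQL” changelog; the author, changeset id, table, and column below are made up, but each changeset is tracked so it runs exactly once and can declare its own rollback:

        --liquibase formatted sql

        --changeset jdoe:add-phone-to-customers
        ALTER TABLE customers ADD COLUMN phone VARCHAR(32);
        --rollback ALTER TABLE customers DROP COLUMN phone;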

    Good database management will involve least-privilege principles, so nobody has access to the most dangerous commands and few people/systems have access to the merely quite dangerous ones.

    The bigger you are, the more processes and protections you put in place (ideally) to decrease the chance of such incidents and, more importantly, the more likely you are to have a playbook defining how to recover quickly when they do happen. In big companies there will end up being one to several classes of employees between the devs and the critical prod databases. Companies will pay millions in wages and vendor contracts for systems that provide observability and protection covering database-related incidents, and those protections will also include audits and change-control processes for any code that runs. Different deployment techniques and, of course, thorough testing in various preprod environments can uncover most issues as well.