In a brand new program where space isn\'t really that big a deal, is it better to delete a row or to disable a row by let\'s say a boolean \"Disabled\" and have the program
You need to have it in functional requirements. If it is not said there explicitly you will have to figure out it yourself.
In most cases it is better to store such records in separate table. You then avoid various situations where one table refers another table and you need to decide should records in second table be treated as deleted as well or not.
If you will need the deleted data sometimes, but not very often: you can move the records into a separate database/table (e.g. users
and users_deleted
, or better somedb.users
and somedb_deleted.users
).
This way, the data is still accessible through a query (although it won't be as simple as the normal one), yet it doesn't clutter the original database and you don't have to code around it.
This should be determined by the application needs. I have done it both ways. I have some applications that need to support undo as the cost of removing a row -- and the cascading deletes that are caused by that -- are too expensive to not have it. Normally, though, the applications I have done require the user to confirm deletes, then just do as the user has asked. In some cases, you must delete the data due to privacy concerns. That is, if the user requests to be removed, you need to really remove it, not just mark it as not current. In other cases (like tax-related transactions), there may be reasons to keep data in a non-current state until no longer required by law. I have applications that fit in both categories.
Various strategies can be used in the case where you need to keep "archival" data. Depending on whether it needs to be immediately available you can push it to archive tables that are either kept or backed up and cleaned out regularly. If there is a need for undo you may want to keep it in the current table and just mark it by setting a flag. It really depends on the complexity of your schema, the requirements of the application, and personal preference to some extent.
It depends on the function of the database. Is it the source of all truth? If yes, then disable rather than delete, as it is easier to recover from bad operations (ie user error). If the database is feed from some upstream data source, delete then unused data. Any recreation/recovery can be done by the upstream system.
After reading a book on temporal database design, I came to believe in the philosophy that every record of temporal significance needs to have at least 4 timestamp columns. Those four are: created, deleted, start, end. The created and deleted timestamps are fairly self-explanatory. Your system shouldn't look at records where deleted is before now(). The start and end columns determine when the data applies to your system. It's for keeping a history of changes. If you need to update a record, you'd set it's end time to now(), copy it, update the copy, and set the copy's start time to now(). That way, when you need to look at the way something was historically, you can have the system figure it out. You could also set the start to some point in the future to have a change take place automatically at that time, or set the end to a future time to have it automatically go away at that time. Setting the created/deleted timestamps to the future doesn't really make sense...
Adding a "DELETED" column to your table and marking rows instead of deleting them creates a lot more work for you with little (if any) benefit. Now, every time you write a query you have to remember to include "WHERE DELETED IS NOT NULL" (or whatever).
A better approach is to delete data when you need to delete data, and rely on your regular backup process to ensure that no data is ever lost. If for some reason you need to keep some deleted data handy (for searches, maybe), you're better off just copying the data to a different table created for this purpose and then deleting the originals.
I've inherited many databases over the years, and this strategy of flagging records instead of deleting them is unfortunately very common, and (in my experience at least) always leads to major problems down the road.