Best practices for data deletion on user account termination

遥遥无期 2021-02-02 01:29

On a site that has a fair share of user-generated content such as forum threads, blog comments, submitted articles, private and public messaging, user profiles, etc; what is the

4 answers
  • 2021-02-02 02:18

    With databases, you rarely delete anything outright. You can mark a record as deleted, but you generally keep it in the database, at least for a time.

    There are many reasons for this. Some are legal: you may be required to keep data for a given period. Some are technical. Sometimes it's just a safeguard: you may need to restore the information. The user may ask for their account to be reopened, or the account may have been locked for spamming when it had in fact been compromised and has since been recovered.

    Old data may eventually be deleted or archived, but that may take months or even years.

    Personally, I just give the relevant data a status column (e.g. 1 = active, 0 = deleted) and, 99% of the time, change the status rather than delete the row.
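
As a rough illustration of the status-column approach, here is a minimal sketch using Python's built-in `sqlite3` module. The table name `users`, the helper names and the in-memory database are assumptions made for the example, not part of any particular schema.

```python
import sqlite3

# Status values as described above: 1 = active, 0 = deleted.
ACTIVE, DELETED = 1, 0

conn = sqlite3.connect(":memory:")  # throwaway database for the sketch
conn.execute(
    "CREATE TABLE users ("
    " id INTEGER PRIMARY KEY,"
    " nick TEXT NOT NULL,"
    " status INTEGER NOT NULL DEFAULT 1)"
)
conn.executemany("INSERT INTO users (nick) VALUES (?)", [("alice",), ("bob",)])


def soft_delete_user(conn, user_id):
    """Flag the row as deleted instead of issuing a DELETE."""
    conn.execute("UPDATE users SET status = ? WHERE id = ?", (DELETED, user_id))


def restore_user(conn, user_id):
    """Undo a soft delete, e.g. when a compromised account is recovered."""
    conn.execute("UPDATE users SET status = ? WHERE id = ?", (ACTIVE, user_id))


def active_nicks(conn):
    """Return the nicks of users that are still active, in id order."""
    rows = conn.execute(
        "SELECT nick FROM users WHERE status = ? ORDER BY id", (ACTIVE,)
    )
    return [nick for (nick,) in rows]


soft_delete_user(conn, 1)
```

Every query that lists users then has to filter on `status`, which is the price of being able to call something like `restore_user` later.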

    Data integrity is another issue here. Let me give you an example.

    Assume you have two entities:

    User: id, nick, name, email
    Message: id, sender_id, receiver_id, subject, body
    

    You want to delete a particular User. What do you do about messages they've sent and received? Those messages will appear in someone else's inbox or sent items so you can't delete them. Do you set the relevant field in Message to NULL? That doesn't make a lot of sense either because that message did come from (or go to) somebody, even if they aren't active anymore.

    You're better off just marking that user as deleted and keeping them around. It makes this and similar situations much easier to deal with.

    You also mention forum threads and so on. You can't delete those either (unless there are other reasons to do so such as spam or abuse) because they're content that is related to other content (eg forum messages that have been replied to).

    The only data you can safely and reasonably delete is child data. This is really the difference between aggregation and composition. The User and Message relationship above is aggregation. An example of composition is House and Room: you delete a House and all its Rooms go too, because Rooms cannot exist without a House. This is composition or, in entity-relationship terms, a parent-child relationship.
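
One way to express that difference in a schema is through foreign-key actions. The sketch below uses Python's `sqlite3` module; the table layout, sample rows and the choice of `ON DELETE CASCADE` for rooms are assumptions for illustration only. Rooms disappear with their house, while a user who still has messages cannot be hard-deleted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
-- Composition: a room cannot exist without its house.
CREATE TABLE house (id INTEGER PRIMARY KEY, address TEXT);
CREATE TABLE room (
    id INTEGER PRIMARY KEY,
    house_id INTEGER NOT NULL REFERENCES house(id) ON DELETE CASCADE,
    name TEXT
);
-- Aggregation: messages refer to users but are not owned by them.
CREATE TABLE users (id INTEGER PRIMARY KEY, nick TEXT);
CREATE TABLE message (
    id INTEGER PRIMARY KEY,
    sender_id INTEGER NOT NULL REFERENCES users(id),
    receiver_id INTEGER NOT NULL REFERENCES users(id),
    subject TEXT,
    body TEXT
);
INSERT INTO house VALUES (1, '1 Example St');
INSERT INTO room VALUES (1, 1, 'kitchen'), (2, 1, 'bedroom');
INSERT INTO users VALUES (1, 'alice'), (2, 'bob');
INSERT INTO message VALUES (1, 1, 2, 'hi', 'hello bob');
""")

# Deleting the house removes its rooms through the cascade.
conn.execute("DELETE FROM house WHERE id = 1")
rooms_left = conn.execute("SELECT COUNT(*) FROM room").fetchone()[0]

# Deleting a user who is still referenced by a message is rejected.
try:
    conn.execute("DELETE FROM users WHERE id = 1")
    user_delete_blocked = False
except sqlite3.IntegrityError:
    user_delete_blocked = True
```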

    But you'll find more instances of aggregation than composition (in my experience) so the question becomes: what do you do with that data? It's really hard to erase all traces of someone without deleting things you shouldn't. Just mark them as deleted, locked or inactive and deal with it that way.

  • 2021-02-02 02:22

    I've been thinking about these same issues for quite some time. Honestly, you shouldn't delete a thread started by a user who is being deleted if other people have contributed their time and effort to it. I remember one forum had a rule that you couldn't delete your thread once roughly 11 hours had passed since it was published. I guess the idea is that you can't take your words back once you've said them.

    So it's better to lock the account but not cascade-delete anything related to the user.

    Especially so that they can delete their account, then register under the same name and start all over again.

  • 2021-02-02 02:23

    You should keep all the content and just mark the user as deleted, so other users won't be able to see his or her profile, username, etc. Another user should then be able to register under the same name (since it should become free).
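
If the deleted row stays in the table, a plain unique constraint on the username would keep the name taken forever. One possible way around that, sketched here with Python's `sqlite3` module, is a partial unique index that only covers active rows. The `users` table, the `status` column (1 = active, 0 = deleted) and the helper names are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (
    id INTEGER PRIMARY KEY,
    nick TEXT NOT NULL,
    status INTEGER NOT NULL DEFAULT 1   -- 1 = active, 0 = deleted
);
-- Uniqueness is enforced among active users only.
CREATE UNIQUE INDEX active_nick ON users(nick) WHERE status = 1;
""")


def register(conn, nick):
    """Create a new active user and return its id."""
    cur = conn.execute("INSERT INTO users (nick) VALUES (?)", (nick,))
    return cur.lastrowid


def mark_deleted(conn, user_id):
    """Keep the row, but take it out of the active set."""
    conn.execute("UPDATE users SET status = 0 WHERE id = ?", (user_id,))


first_id = register(conn, "alice")

# While the first account is active, the name is taken.
try:
    register(conn, "alice")
    duplicate_allowed = True
except sqlite3.IntegrityError:
    duplicate_allowed = False

# Once the first account is marked deleted, the name is free again.
mark_deleted(conn, first_id)
second_id = register(conn, "alice")
```

Old content still points at `first_id`, so the new owner of the name does not inherit the previous user's posts.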

  • 2021-02-02 02:26

    You could just mark the user as deleted and then, whenever you display any content involving that user, show the name as "Ex-User" or something similar.

    This protects the departed user's identity without destroying your content.
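
A minimal sketch of that display rule in Python is shown below. The `User` and `Message` classes, the `status` field and the function names are hypothetical, made up for the example. The point is only that the substitution happens at display time, while the stored content keeps its reference to the original user.

```python
from dataclasses import dataclass

ACTIVE, DELETED = 1, 0          # assumed status values
PLACEHOLDER_NAME = "Ex-User"    # placeholder suggested above


@dataclass
class User:
    id: int
    nick: str
    status: int = ACTIVE


@dataclass
class Message:
    id: int
    sender: User
    subject: str


def display_name(user):
    """Return the real nick for active users, the placeholder otherwise."""
    return user.nick if user.status == ACTIVE else PLACEHOLDER_NAME


def render_header(message):
    """Format a message header without exposing a deleted user's nick."""
    return f"{display_name(message.sender)}: {message.subject}"


alice = User(1, "alice")
gone = User(2, "bob", status=DELETED)
```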
