I've been reading a lot about relational databases using many JOIN statements on every SELECT. However, I've been wondering if there's any performance problem in the long run.
My advice on data modeling is:
More in Database Development Mistakes Made by AppDevelopers.
Now, as for the directness of a model, let me give you an example. Let's say you're designing a system for authentication and authorization of users. An overengineered solution might look something like this:
So you need 6 joins to get from the username entered to the actual privileges. Sure, there might be an actual requirement for this, but more often than not this kind of system is put in because some developer is hand-wringing that they might someday need it, even though every user only has one alias, user to login is 1:1, and so on. A simpler solution is:
and, well, that's it. Perhaps you need a complex role system, but it's also quite possible that you don't. If you do, it's reasonably easy to slot in (user type becomes a foreign key into a user types or roles table), and it's generally straightforward to map the old to the new.
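As a rough illustration of the simpler end of that spectrum, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names (users, username, password_hash, user_type) are my own assumptions for illustration, not the schema from the answer; the point is only that the lookup needs no joins at all.

```python
import sqlite3

# Hypothetical single-table schema; all names here are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE users (
        id            INTEGER PRIMARY KEY,
        username      TEXT NOT NULL UNIQUE,
        password_hash TEXT NOT NULL,
        user_type     TEXT NOT NULL DEFAULT 'member'  -- could later become a FK into a roles table
    )
    """
)
conn.execute(
    "INSERT INTO users (username, password_hash, user_type) VALUES (?, ?, ?)",
    ("alice", "not-a-real-hash", "admin"),
)


def user_type_for(conn, username):
    """Return the user's type via one lookup on the unique username, or None."""
    row = conn.execute(
        "SELECT user_type FROM users WHERE username = ?", (username,)
    ).fetchone()
    return row[0] if row else None


print(user_type_for(conn, "alice"))
```

If roles ever do get complicated, user_type is the one column that would change, which is the "easy to slot in" part.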
This is the thing about complexity: it's easy to add and hard to remove. Usually it's a constant vigil against unintended complexity, which is bad enough without making it worse by adding unnecessary complexity.
If the data is 1 <-> 1 and you will not have many null fields, don't over-normalize. You can still specify only the fields required ("most used data") in the SELECT statements.
Fear not joining. The relational model is strong and you should employ it. Someone always brings up N+1, but also consider, in your context, joining against users often for security purposes too, since the query can additionally mandate user existence, status, session correctness, and field expectations.
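To make the security point concrete, here is a minimal sketch with Python's built-in sqlite3 module. The users, sessions and documents tables, the status values, and the integer timestamps are all hypothetical; the idea is just that the join itself refuses to return rows unless the session exists, has not expired, and belongs to an active user.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical schema for illustration only.
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL, status TEXT NOT NULL);
    CREATE TABLE sessions (
        token      TEXT PRIMARY KEY,
        user_id    INTEGER NOT NULL REFERENCES users(id),
        expires_at INTEGER NOT NULL
    );
    CREATE TABLE documents (
        id       INTEGER PRIMARY KEY,
        owner_id INTEGER NOT NULL REFERENCES users(id),
        title    TEXT NOT NULL
    );

    INSERT INTO users VALUES (1, 'alice', 'active'), (2, 'bob', 'suspended');
    INSERT INTO sessions VALUES ('tok-alice', 1, 2000), ('tok-bob', 2, 2000), ('tok-old', 1, 500);
    INSERT INTO documents VALUES (10, 1, 'Alice notes'), (11, 2, 'Bob notes');
    """
)


def documents_for_session(conn, token, now):
    """Return (id, title) rows only if the session is valid and its user is active."""
    return conn.execute(
        """
        SELECT d.id, d.title
        FROM sessions AS s
        JOIN users AS u     ON u.id = s.user_id
        JOIN documents AS d ON d.owner_id = u.id
        WHERE s.token = ?
          AND s.expires_at > ?
          AND u.status = 'active'
        ORDER BY d.id
        """,
        (token, now),
    ).fetchall()


print(documents_for_session(conn, "tok-alice", 1000))
print(documents_for_session(conn, "tok-bob", 1000))
```

A suspended user, an expired session, or an unknown token all simply yield no rows, so there is no separate "is this user allowed?" query to forget.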
Many large sites go so far as to have a session table and an HTTP request table with a row for every request, always joined against each other for the page queries. The benefit is that parameters are always matched to sessions, sessions to proper users, user status is always checked, and so on, but more so that it allows for some interesting scale-out benefits.
Long story short: do it wisely, but don't skimp on joining.
Some bright person once said:
Normalize until it hurts, denormalize until it works!
It all depends on the type of joins and the join conditions, but there is nothing wrong with them. Joins ON table1.PK = table2.FK are typically very efficient, because the primary key is indexed.
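If you want to see how a PK = FK join is actually executed, you can ask the database for its plan. Here is a minimal sketch using Python's built-in sqlite3 module (SQLite rather than MySQL, purely so it runs anywhere). The customers/orders tables and the index name are made up for illustration, and the exact plan wording varies between SQLite versions, so treat the printed text as indicative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical schema for illustration only.
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE orders (
        id          INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        total       INTEGER NOT NULL
    );
    -- Index the foreign key side as well, so either join direction can use a lookup.
    CREATE INDEX idx_orders_customer_id ON orders(customer_id);
    """
)


def join_plan(conn):
    """Return the human-readable steps SQLite reports for a PK = FK join."""
    rows = conn.execute(
        """
        EXPLAIN QUERY PLAN
        SELECT c.name, o.total
        FROM customers AS c
        JOIN orders AS o ON o.customer_id = c.id
        """
    ).fetchall()
    # The last column of each plan row is the step description.
    return [row[-1] for row in rows]


plan = join_plan(conn)
for step in plan:
    print(step)
```

What you are looking for is a SEARCH step that uses a key or index for one of the tables, rather than two full scans. MySQL's EXPLAIN gives the equivalent information in a different format.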
As others have said - joins aren't a thing to avoid at all. In fact, in most models it is rare not to have a few joins in every single query that the application runs.
Even in the biggest queries they aren't usually a performance problem, and they often fix performance problems that would occur if you had redundant and repeating data all over the place.
However, be aware that under the covers the database joins just two tables at a time. So joins necessitate multiple steps by the database that are invisible to the developer. When it does these joins it has to make a few decisions about how to go about it:
So, if your joins are complex, the efficiency will ultimately be driven by the sophistication of your optimizer/planner and the currency and detail of your statistics. MySQL isn't a strong contender here, so I'd generally keep my model and SQL a little simpler than if I were using something else. But a few joins per query should almost always be fine.
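The "two tables at a time" point can be sketched in plain Python. This is a toy hash join over lists of dicts, not how any particular database implements it, and the users/orders/items data is invented. It shows that a three-table join is really two pairwise joins, and that the order chosen changes the size of the intermediate result even though the final rows are the same.

```python
def hash_join(left, right, left_key, right_key):
    """Join two lists of dict rows where left[left_key] == right[right_key]."""
    index = {}
    for r in right:
        index.setdefault(r[right_key], []).append(r)
    # Column names are assumed to be distinct across tables, so merging is safe.
    return [{**l, **r} for l in left for r in index.get(l[left_key], [])]


# Hypothetical data for illustration only.
users = [{"user_id": 1, "name": "alice"}, {"user_id": 2, "name": "bob"}]
orders = [
    {"order_id": 10, "order_user_id": 1},
    {"order_id": 11, "order_user_id": 1},
    {"order_id": 12, "order_user_id": 2},
]
items = [
    {"item_id": 100, "item_order_id": 10, "sku": "A"},
    {"item_id": 101, "item_order_id": 12, "sku": "B"},
]

# Plan A: (users JOIN orders) first, then JOIN items.
step_a = hash_join(users, orders, "user_id", "order_user_id")
plan_a = hash_join(step_a, items, "order_id", "item_order_id")

# Plan B: (orders JOIN items) first, then JOIN users.
step_b = hash_join(orders, items, "order_id", "item_order_id")
plan_b = hash_join(users, step_b, "user_id", "order_user_id")

print(len(step_a), len(step_b))  # sizes of the two intermediate results
print(len(plan_a), len(plan_b))  # sizes of the two final results
```

Picking between orderings like these is the planner's job, and it can only estimate the intermediate sizes from its statistics, which is why stale statistics tend to hurt more as the number of joins grows.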