Relational databases - there has to be more, right?

没有蜡笔的小新 2021-01-31 00:17

I really enjoy database design and the whole concept of managing data semantically and all the logic that comes with it.

My knowledge level when it comes to databases is

19 answers
  • 2021-01-31 00:54

    To my mind there are three "tracks" with database skills: Developer, DBA and Architect. From a development perspective you want to focus on development, understand the architecture side, and pick up as much DBA knowledge as you need along the way.

    As a developer, the key thing (to my mind) is to get your SQL to a really good standard. As an interviewer looking for a developer, I care less about whether you can design databases than about how well you can write queries. Assuming you know your basic CRUD commands, do you know about:

    Stored Procedures (not just how to use them but when and why)
    Views (ditto, including materialised views)
    Triggers (insert, update, delete, how and why)
    Cursors (especially impact on performance)
    Referential integrity
    Transactions
    Indexes
    Adding defaults, constraints and identities to tables
    Complex use of group by and having
    Functions especially:
    - Date and time manipulation
    - String manipulation
    - Handling nulls
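    As a rough illustration of a few items on that list (group by and having, null handling, and date manipulation), here is a minimal sketch using Python's built-in sqlite3 module. The `orders` table, its columns, and the sample rows are hypothetical, and function names such as `COALESCE` and `strftime` are what SQLite provides; other platforms spell some of these differently.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical orders table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders ("
        " customer TEXT NOT NULL,"
        " amount REAL,"              # NULL means the amount was never recorded
        " placed_on TEXT NOT NULL)"  # ISO date string, e.g. 2021-01-05
    )
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [
            ("alice", 30.0, "2021-01-05"),
            ("alice", 20.0, "2021-01-20"),
            ("bob", None, "2021-01-07"),
            ("bob", 5.0, "2021-02-01"),
            ("carol", 100.0, "2021-02-03"),
        ],
    )
    conn.commit()
    return conn


def top_customers(conn, min_total):
    """Customers whose total spend is at least min_total, largest first.

    Grouping, NULL handling and filtering all happen in SQL, so the
    calling code does no post-processing.
    """
    query = """
        SELECT customer,
               COUNT(*)                 AS order_count,
               SUM(COALESCE(amount, 0)) AS total
        FROM orders
        GROUP BY customer
        HAVING SUM(COALESCE(amount, 0)) >= ?
        ORDER BY total DESC, customer
    """
    return conn.execute(query, (min_total,)).fetchall()


def orders_per_month(conn):
    """Number of orders per calendar month, using SQLite's strftime."""
    query = """
        SELECT strftime('%Y-%m', placed_on) AS month,
               COUNT(*)                     AS order_count
        FROM orders
        GROUP BY month
        ORDER BY month
    """
    return conn.execute(query).fetchall()
```

    Calling `top_customers(build_demo_db(), 40)` returns only the customers who clear the threshold; the `HAVING` clause filters on the aggregate, which a `WHERE` clause cannot do.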

    You should be able to pull any data you need from your database using SQL alone; you should never need to manipulate or parse it in procedural code. You may choose to, but it should be a choice rather than because you didn't know how to do it in SQL.

    As a developer, the one book I'd look at is Joe Celko's SQL for Smarties. It has lots of SQL for things you may never have thought you could do in SQL.

    One of the best ways to learn this stuff is, tedious as it seems, writing reports (management information). I've seen so many people moan about writing reports being tedious and then do it really, really badly (and not just because they didn't try). Reports tend to be close to pure SQL, so you have to really get to know the tools at hand, and a complex report really separates those who know SQL from those who don't. People also tend not to want to wait too long for them, so performance is key too.

    Look at your current database and come up with a bunch of really really awkward things someone might actually want to know. Think marketing, trends, most and least popular. Then try and combine a bunch of them into one query.

    In terms of performance I'd also be trying to get inside how the query optimizer works, how it makes decisions about when to use an index and when to table scan, when indexes will help and when they'll hinder.
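    One hedged way to start looking inside the optimizer is to ask the database for its plan before and after adding an index. The sketch below uses SQLite's `EXPLAIN QUERY PLAN` through Python's sqlite3 module; the `users` table and the `idx_users_email` index are hypothetical, and the wording of the plan text varies between SQLite versions and looks nothing like the output of other platforms.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the human-readable detail column of SQLite's query plan."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each row holds the plan description text.
    return [row[-1] for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO users (email, name) VALUES (?, ?)",
    [(f"user{i}@example.com", f"User {i}") for i in range(1000)],
)

lookup = "SELECT name FROM users WHERE email = ?"
args = ("user42@example.com",)

# Without an index on email the only option is to scan the whole table.
plan_before = query_plan(conn, lookup, args)

# With an index on the filtered column the optimizer can search it instead.
conn.execute("CREATE INDEX idx_users_email ON users (email)")
plan_after = query_plan(conn, lookup, args)

print("before:", plan_before)
print("after: ", plan_after)
```

    The same habit carries over to other engines, each of which has its own plan-inspection command, and it is also how you find out when an index you added is being ignored.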

    A good developer doesn't just write good queries, they write quick, maintainable queries. To really get to grips with this you'll need to play around with a database with a dozen (or more) tables containing, ideally, millions of rows. That's when you start seeing queries you thought were fine dragging their heels.

    The architect/designer stuff others have covered pretty well. All I'd say on the subject is that for every database that has to be designed there are hundreds of queries that need to be written for it. You might want to consider that proportional breakdown of work when you're upskilling and make sure your querying is really up to scratch.

    In terms of links, it depends on the platform - all this stuff tends to be platform specific. But then that's what Google is for.

    Not, I suspect, entirely what you wanted, but worth knowing, as a lot of people who think they know SQL really, really don't...

  • 2021-01-31 00:55

    Well, it's always good to design examples... See if anyone you know needs a database for something. But studying VLDB (Very Large DataBase) techniques might be useful depending on the industry you're interested in.

  • 2021-01-31 00:55

    Disclaimer: not an expert in database design.

    Some of the performance issues can be handled either by:

    1. denormalizing your database, so as to reduce the number of tables to join
    2. adding indexes
    3. filtering should be done so that you first remove the largest of the non matching data, then you cherry pick the next condition on the reduced set. It's better to go from 100 values -> 10 matching first condition -> 1 matching first and second condition than 100 values -> 80 matching second condition -> 1 matching first and second condition. Seems trivial, but it's important to keep in mind.
    4. divide et impera is the motto for scalability. If you have something that scales in a non-linear way, say O(N^2) it makes sense to keep N as low as you can, and you should partition your data set into smaller sets, assuming they are independent and you can work out the partitioning. An example of this is sharding, typically used to keep databases of users in large social websites. (NB: an example, I would not implement it this way) Instead of having a huge database with all the users, they have 26 servers (one for each letter of the alphabet), then they put all the nicknames with the same first letter in the same server. This has the following advantages:

      a. you balance the load on different machines
      b. if one machine crashes, you make the site inaccessible only to a subset of your users, not to all of them
      c. you preselect the search with a highly discriminating criterion (the first letter), then perform the second search (the username)
      d. you reduce the number of entries each database has.
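    The first-letter routing described in point 4 can be sketched in a few lines of Python. This is only an illustration of the idea, not a recommended design: the shard names are hypothetical, and the fallback shard for nicknames that do not start with a letter a-z is an added assumption that the description above does not cover.

```python
import string

# One hypothetical shard per letter of the alphabet, as in the example above.
SHARDS = {letter: f"users-{letter}" for letter in string.ascii_lowercase}

# Assumption: nicknames starting with a digit or symbol go to a catch-all shard.
FALLBACK_SHARD = "users-other"


def shard_for(nickname):
    """Pick the shard that stores a nickname, based on its first letter."""
    first = nickname.strip()[:1].lower()
    return SHARDS.get(first, FALLBACK_SHARD)


def group_by_shard(nicknames):
    """Group nicknames by target shard, e.g. to batch lookups per server."""
    groups = {}
    for nickname in nicknames:
        groups.setdefault(shard_for(nickname), []).append(nickname)
    return groups
```

    Because first letters are not evenly distributed across real nicknames, the shards would end up unevenly loaded, which is one reason to treat this as an example rather than an implementation; hashing the nickname is a common alternative.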

  • 2021-01-31 00:55

    I highly recommend starting with www.dbdebunk.com. It has a lot of practical material, as opposed to theory. The site is a little outdated, but still useful. Even the commercial content isn't too expensive, if you really want to become a database professional.

  • 2021-01-31 00:58

    I'll volunteer a list of areas that you might want to consider as aspects of programming with databases. I would not claim that you need to be expert at all of them, or even most of them, in order to be able to program using a DBMS, nor even to program a DBMS. However, they are all topics that are of some relevance at some times - in no particular order:

    • Query language design
    • Query optimization
    • Query rewriting
    • Data types
    • Storage organization
    • Transaction management
    • Communications protocols
    • Encryption
    • Authentication and identification
    • Schema design
    • Replication
    • Backup and restore
    • Two-phase commits
    • Optimistic concurrency control
    • Locking and pessimistic concurrency control
    • Authorization
    • Label-based access control
    • Set theory
    • Relational theory
    • Distributed query
    • Boolean logic
    • User-defined types and functions
    • Catalog management
    • Buffer management
    • Sorting
    • Internationalization (I18N), Localization (L10N), Globalization (G11N)
    • Quantifiers
    • Auditing
    • Triggers
    • Stored procedures

    I make no claims of completeness or minimality, either.

  • 2021-01-31 00:58

    The standard text in the field is "An Introduction to Database Systems", by C. J. Date.

    I have twenty years of C experience; I read it, thought it excellent, and wrote a relational database because of it (a proper one, not this SQL malarkey!).
