I have been mulling over config files and their relationship to code for a while now, and depending on the day and the direction of the wind, my opinions seem to change. More and mor
I have seen Python programs where the config file is code. If you don't need to do anything special (conditionals, etc.), it doesn't look much different from other config styles. For example, I could make a file config.py
with stuff like:
num_threads = 13
hostname = 'myhost'
and the only burden on the user, compared with (say) INI files, is that they need to put quotes around strings. No doubt you could do the same thing in other interpreted languages. It gives you unlimited ability to complicate your config file if necessary, at the risk of scaring your users.
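As a rough sketch of how a program might read such a file, the snippet below writes the two settings above to a temporary config.py and loads them with the standard library's runpy module. The helper name load_config and the use of a temporary file are assumptions made for illustration, not something the programs I have seen necessarily do. Note that loading a config this way executes whatever code the file contains, so it only makes sense for files you trust.

```python
import runpy
import tempfile
from pathlib import Path

# Contents of the hypothetical config.py from the example above.
CONFIG_SOURCE = "num_threads = 13\nhostname = 'myhost'\n"


def load_config(path):
    """Execute a Python config file and return its public top-level names."""
    namespace = runpy.run_path(str(path))
    # Drop module bookkeeping such as __name__ and __builtins__.
    return {k: v for k, v in namespace.items() if not k.startswith("_")}


# Write the config to a temporary directory so the sketch is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    config_path = Path(tmp) / "config.py"
    config_path.write_text(CONFIG_SOURCE)
    config = load_config(config_path)

print(config["num_threads"], config["hostname"])
```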
Config files invariably inch their way toward becoming ugly, illogical "full-fledged programming languages". It takes art and skill to design good programming languages, and config languages turned programming languages tend to be horrendous.
A good approach is to use a nicely designed language, say Python or Ruby, and use it to create a DSL for your configuration. That way your configuration language can remain simple on the surface but in actuality be the full-fledged programming language.
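A minimal sketch of what such a DSL could look like in Python follows. The server() helper, its parameters, and the load_dsl_config function are all hypothetical names invented for this example. The config text is executed with only that helper exposed, so simple configs read like declarations, while loops and arithmetic remain available when someone needs them. This is not a sandbox: exec still runs arbitrary Python.

```python
def load_dsl_config(source):
    """Run config text that calls the hypothetical server() helper."""
    settings = {"servers": {}}

    def server(name, port=80, threads=1):
        # Each call in the config file declares one server entry.
        settings["servers"][name] = {"port": port, "threads": threads}

    # Only server() is injected; Python adds its builtins automatically.
    exec(source, {"server": server})
    return settings


# A simple declaration, plus a loop for users who need more power.
CONFIG_TEXT = """
server("web", port=8080, threads=4)
for i in range(2):
    server(f"worker{i}", port=9000 + i)
"""

settings = load_dsl_config(CONFIG_TEXT)
for name, options in settings["servers"].items():
    print(name, options)
```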
Every (sufficiently-long-lived) config file schema eventually becomes a programming language. Due to all the implications you describe, it is wise for the config-file designer to realize she is authoring a programming language and plan accordingly, lest she burden future users with bad legacy.
I have a different philosophy about config files. Data about how an application should be run is still data, and therefore belongs in a data store, not in code (a config file IMO is code). If end users need to be able to change the data, then the application should provide an interface to do so.
I only use config files to point at data stores.
OK. You will have some users who want a really simple config, and you should give it to them. At the same time, you will get constant requests along the lines of "Can you add this? How do I do this in the config file?" I don't see why you can't support both groups.
The project I am currently working on uses Lua for its configuration file. Lua is a scripting language, and it works quite well in this scenario. An example of our default configuration is available.
You'll note that it is mainly key=value statements, where value can be any of Lua's built-in types. The most complicated things there are lists, and they aren't really complicated (it's just a matter of syntax).
Now I'm just waiting for someone to ask how to set their server's port to a random value every time they start it up...
Very interesting questions!
I tend to limit my config files to a very simple "key=value" format, because I fully agree with you that config files can very quickly become full-blown programs. For example, anyone who has ever tried to "configure" OpenSER knows the feeling you are talking about: it's not configuration, it's (painful) programming.
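To show how little machinery a plain "key=value" format needs, here is a small parser sketch in Python. The function name parse_key_value, the "#" comment convention, and the sample keys are assumptions for illustration; the format described above does not specify them. Every value stays a string, which is part of what keeps the format from growing into a language.

```python
def parse_key_value(text):
    """Parse simple key=value lines into a dict of strings."""
    settings = {}
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        # Skip blank lines and comment lines (assumed to start with '#').
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep or not key.strip():
            raise ValueError(f"line {lineno}: expected key=value, got {raw!r}")
        settings[key.strip()] = value.strip()
    return settings


SAMPLE = """
# deployment parameters
hostname = myhost
num_threads = 13
"""

parsed = parse_key_value(SAMPLE)
print(parsed)
```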
When you need your application to be "configurable" in ways that you cannot imagine today, what you really need is a plugin system. You need to develop your application in such a way that someone else can write a new plugin and hook it into your application in the future.
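One possible shape for this, sketched in Python: plugins register themselves under a name, and the config file only lists which names to enable, so the logic lives in code rather than in the config. The registry, the register decorator, and the two toy plugins are hypothetical and exist only to illustrate the split.

```python
# Registry mapping plugin names to callables (a hypothetical design).
PLUGINS = {}


def register(name):
    """Decorator that makes a function available as a named plugin."""
    def decorator(func):
        PLUGINS[name] = func
        return func
    return decorator


@register("upper")
def upper(text):
    return text.upper()


@register("reverse")
def reverse(text):
    return text[::-1]


def run_pipeline(plugin_names, text):
    """Apply the plugins named in the config, in order."""
    for name in plugin_names:
        text = PLUGINS[name](text)
    return text


# The config stays simple: just the names of the plugins to enable.
enabled = "upper,reverse".split(",")
result = run_pipeline(enabled, "abc")
print(result)
```

A third party can add behavior by registering a new function, and the admin enables it by adding one name to the list.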
So, to answer your questions:
What is the true purpose of a config file?
I would say it is to allow the people who install your application to tweak some deployment-related parameters, such as the host name, the number of threads, the names of the plugins you need, and the deployment parameters for those plugins (check out FreeRadius's configuration for an example of this principle). It is definitely not the place to express business logic.
Should an attempt be made to keep config files simple?
Definitely. As you suggested, "programming" in a config file is horrible. I believe it should be avoided.
Who should be responsible for making changes to them (developers, users, admins, etc.)?
In general, I would say admins, who deploy the application.
Should they be source controlled (see question 3)?
I usually don't source-control the configuration files themselves, but I do source-control a template configuration file, with all the parameters and their default values, and comments describing what they do. For example, if a configuration file is named database.conf, I usually source-control a file named database.conf.template. Now of course I am talking about what I do as a developer. As an admin, I may want to source-control the actual settings that I chose for each installation. For example, we manage a few hundred servers remotely, and we need to keep track of their configurations: we chose to do this with source-control.
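As an illustration of one way to use such a template, the sketch below treats the template as the source of documented defaults and layers the installation's own settings on top, using Python's configparser. This goes slightly beyond what I described (copying the template by hand works just as well), and the section name, keys, and host names are made up for the example.

```python
import configparser

# Stand-in for database.conf.template: every parameter, with defaults.
TEMPLATE_TEXT = """
[database]
# Host name of the database server
host = localhost
# TCP port the database listens on
port = 5432
"""

# Stand-in for one installation's database.conf: only what differs.
LOCAL_TEXT = """
[database]
host = db.example.internal
"""


def load_with_defaults(template_text, local_text):
    """Read the template first, then let local settings override it."""
    parser = configparser.ConfigParser()
    parser.read_string(template_text)
    parser.read_string(local_text)
    return parser


cfg = load_with_defaults(TEMPLATE_TEXT, LOCAL_TEXT)
print(cfg["database"]["host"], cfg.getint("database", "port"))
```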
Edit: Although I believe the above to be true for most applications, there are always exceptions, of course. Your application may allow its users to dynamically configure complex rules, for example. Most email clients allow users to define rules for managing their emails (for example, "all emails coming from 'john doe' and not having me in the To: field should be discarded"). Another example is an application that allows the user to define a new complex commercial offer. You may also think about applications like Cognos, which allow their users to build complex database reports. The email client will probably offer the user a simple interface to define the rules, and this will generate a complex configuration file (or perhaps even a bit of code). On the other hand, the user-defined configuration for the commercial offers might be saved in a database, in a structured way (neither a simple key=value structure nor a portion of code). And some other applications might even allow the user to code in Python or VB, or some other automation-capable language. In other words... your mileage may vary.