What is your intended data source and how big is it? If you can define the format of your data source, I would suggest having a text file in which each line has three or four fields, separated by some sort of delimiter. The fields would be a question, the correct answer, and either a list of characters indicating the categories into which the question and answer both belong, or a list of categories for the question and another for the answer.
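As a rough illustration of that file format, here is a minimal Python sketch of a line parser. The `|` delimiter, the `Entry` type, and the `parse_line` name are my own assumptions for the example, not anything prescribed; any delimiter that never appears in your questions or answers would do.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    """One line of the (hypothetical) quiz data file."""
    question: str
    answer: str
    question_cats: frozenset  # category letters attached to the question
    answer_cats: frozenset    # category letters attached to the answer


def parse_line(line, delimiter="|"):
    """Parse 'question|answer|cats' or 'question|answer|qcats|acats'."""
    fields = [field.strip() for field in line.rstrip("\n").split(delimiter)]
    if len(fields) == 3:
        # Three fields: one category list shared by question and answer.
        question, answer, cats = fields
        q_cats = a_cats = frozenset(cats)
    elif len(fields) == 4:
        # Four fields: separate category lists for question and answer.
        question, answer, q_field, a_field = fields
        q_cats, a_cats = frozenset(q_field), frozenset(a_field)
    else:
        raise ValueError(f"expected 3 or 4 fields, got {len(fields)}: {line!r}")
    return Entry(question, answer, q_cats, a_cats)
```

With this layout, a line such as `How many sides does a triangle have?|three|Q` yields an entry whose question and answer both carry the single category letter `Q`.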
To clarify the last point: in many multiple-choice tests, if one simply chooses ten questions at random from a pool of 25, and then for each question prints three random answers from the pool along with the correct answer, one may end up with a question like: "How many sides does a triangle have? (a) square (b) Euclid (c) three (d) rhombus". A COMPUTE! magazine article some decades back offered a multiple-choice quiz generator which solved this problem by what it called "discrimination": attaching categories to questions and answers, and for each question picking only answers suitable for the question's category. I don't remember exactly how that article did things, but for simplicity of coding and data entry I would suggest that you identify categories of questions and answers, and pick a letter for each. For the above question, a reasonable category might be "written-out whole numbers less than thirteen", so if one arbitrarily decides to use the character "Q" for that, both the question and answer would have a category of "Q". In many cases, a single category for question and answer would suffice (I think that's how the COMPUTE! program worked), but in some cases one may need to allow something more sophisticated. For example, for "A shape with four sides, and with pairs of opposite sides equal, is:", it may be reasonable to offer up "pentagon" as an option, but probably not "square", "rectangle", or "rhombus", since each of those also fits the description.
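To make the category idea concrete, here is a small Python sketch of choosing wrong answers by category. It is my own guess at one reasonable approach, not a reconstruction of the COMPUTE! program: the `pick_distractors` name, the `(answer, category_letters)` pool layout, and the rule "an answer is eligible if it shares at least one category letter with the question" are all assumptions.

```python
import random


def pick_distractors(pool, question_cats, correct, count=3, rng=random):
    """Pick `count` wrong answers whose categories overlap the question's.

    `pool` is a list of (answer, category_letters) pairs.
    """
    # A set comprehension drops repeated answers; sorting afterwards keeps
    # the candidate order stable so a seeded rng gives repeatable results.
    candidates = sorted({
        answer
        for answer, answer_cats in pool
        if answer != correct and set(answer_cats) & set(question_cats)
    })
    if len(candidates) < count:
        raise ValueError("not enough suitable answers in the pool")
    return rng.sample(candidates, count)


pool = [
    ("three", "Q"), ("five", "Q"), ("six", "Q"), ("eight", "Q"),
    ("square", "S"), ("rhombus", "S"), ("Euclid", "P"),
]
# Only number words can be drawn for a "Q" question, so "square",
# "rhombus", and "Euclid" are never offered for the triangle question.
print(pick_distractors(pool, "Q", "three"))
```

Handling the "pentagon but not square" case would need a richer rule than simple overlap, for instance separate question and answer categories as in the four-field format.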
There are a few more issues to consider in the design of the data set, such as how it should handle the possibility that multiple questions may have the same answer, and whether answers should be listed in random order or in a consistent order (e.g. for "How many sides does a pentagon have?", it may be nicer to list the answers as "(a) three (b) five (c) six (d) eight" than as "(a) eight (b) five (c) six (d) three").
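Both of those issues can be handled at display time. The following Python sketch is only one possible approach under my own assumptions: the `format_options` name and the `RANK` table are hypothetical, and the natural order is defined only for the written-out numbers zero through twelve, with everything else falling back to alphabetical order.

```python
import string

# Hypothetical rank table: natural order for written-out numbers below thirteen.
NUMBER_WORDS = [
    "zero", "one", "two", "three", "four", "five", "six",
    "seven", "eight", "nine", "ten", "eleven", "twelve",
]
RANK = {word: index for index, word in enumerate(NUMBER_WORDS)}


def format_options(options, rank=RANK):
    """Drop repeated answers, then list options in a consistent order."""
    # dict.fromkeys removes duplicates while keeping first occurrences,
    # which covers several questions sharing the same answer.
    unique = list(dict.fromkeys(options))
    # Ranked answers come first in rank order; unranked ones follow alphabetically.
    ordered = sorted(unique, key=lambda opt: (opt not in rank, rank.get(opt, 0), opt))
    return " ".join(
        f"({letter}) {opt}" for letter, opt in zip(string.ascii_lowercase, ordered)
    )


print(format_options(["eight", "five", "six", "three"]))
# prints: (a) three (b) five (c) six (d) eight
```

Because the letters are assigned after sorting, the program would need to look up which letter the correct answer landed on rather than assuming a fixed position.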