Simple accord.net machine learning example

孤独总比滥情好 2021-02-06 05:43

I’m new to machine learning and new to accord.net (I code C#).

I want to create a simple project where I look at a simple time series of data that oscillate

1 answer
  • 2021-02-06 06:19

    A simple way to do this would be to use an Accord ID3 decision tree.

    The trick is to work out which inputs to use. You can't just train on X, because the tree won't learn anything about future values of X from that. However, you can build some features derived from X (or from previous values of Y) that will be useful.

    Normally, for problems like this, you would make each prediction based on features derived from previous values of Y (the thing being predicted) rather than X. However, that assumes you can observe Y sequentially between each prediction (so you can't predict for an arbitrary X), so I'll stick with the question as presented.

    I had a go at building an Accord ID3 decision tree to solve this problem below. I used a few different values of x % n as the features, hoping the tree could work out the answer from them. In fact x % 4 on its own already determines y here (it carries the same information as (x-1) % 4), so the tree can get there in a single level using just that attribute, but I guess the point is more to let the tree find the patterns.
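
    To see why a single modulus feature is enough here, the following sketch (plain C#, no Accord; `ModLookupSketch`, `Learn` and `Predict` are made-up names for illustration) memorises the y seen for each value of x % 4 over the training range and then looks it up for later x. It is only a rough stand-in for what a one-level split on that attribute amounts to, and it assumes the series really has period 4.

```csharp
using System;
using System.Collections.Generic;

public static class ModLookupSketch
{
    // Same sequence as in the answer: y repeats 1, 2, 3, 2 with period 4.
    static readonly int[] YSequence = { 1, 2, 3, 2 };

    // Correct y for a given x (valid for x >= 1).
    public static int CalcY(int x) => YSequence[(x - 1) % 4];

    // "Train": remember which y was seen for each value of x % 4.
    public static Dictionary<int, int> Learn(int firstX, int lastX)
    {
        var table = new Dictionary<int, int>();
        for (int x = firstX; x <= lastX; x++)
        {
            int key = x % 4;
            int y = CalcY(x);
            // If one key ever mapped to two different y values,
            // x % 4 alone would not be a sufficient feature.
            if (table.TryGetValue(key, out int seen) && seen != y)
                throw new InvalidOperationException("x % 4 does not determine y");
            table[key] = y;
        }
        return table;
    }

    // "Predict": look up the y remembered for this value of x % 4.
    public static int Predict(Dictionary<int, int> table, int x) => table[x % 4];
}
```

    For example, after `var table = ModLookupSketch.Learn(1, 12);`, calling `ModLookupSketch.Predict(table, 13)` returns 1, matching the sequence.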

    Here is the code for that:

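        // These are members of an MSTest class ([TestClass]); the file needs:
        //   using System; using System.Linq;
        //   using Accord.MachineLearning.DecisionTrees;
        //   using Accord.MachineLearning.DecisionTrees.Learning;
        //   using Microsoft.VisualStudio.TestTools.UnitTesting;
    
        // this is the sequence y follows
        int[] ysequence = new int[] { 1, 2, 3, 2 };
    
        // this generates the correct Y for a given X (valid for x >= 1)
        int CalcY(int x) => ysequence[(x - 1) % 4];
    
        // this generates some inputs - just a few different moduli of x
        int[] CalcInputs(int x) => new int[] { x % 2, x % 3, x % 4, x % 5, x % 6 };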
    
    
        // for http://stackoverflow.com/questions/40573388/simple-accord-net-machine-learning-example
        [TestMethod]
        public void AccordID3TestStackOverFlowQuestion2()
        {
            // build the training data set
            int numtrainingcases = 12;
            int[][] inputs = new int[numtrainingcases][];
            int[] outputs = new int[numtrainingcases];
    
            Console.WriteLine("\t\t\t\t x \t y");
            for (int x = 1; x <= numtrainingcases; x++)
            {
                int y = CalcY(x);
                inputs[x-1] = CalcInputs(x);
                outputs[x-1] = y;
                Console.WriteLine("TrainingData \t " +x+"\t "+y);
            }
    
            // define how many values each input can have
            DecisionVariable[] attributes =
            {
                new DecisionVariable("Mod2",2),
                new DecisionVariable("Mod3",3),
                new DecisionVariable("Mod4",4),
                new DecisionVariable("Mod5",5),
                new DecisionVariable("Mod6",6)
            };
    
            // define how many outputs (+1 only because y doesn't use zero)
            int classCount = outputs.Max()+1;
    
            // create the tree
            DecisionTree tree = new DecisionTree(attributes, classCount);
    
            // Create a new instance of the ID3 algorithm
            ID3Learning id3learning = new ID3Learning(tree);
    
            // Learn the training instances! Populates the tree
            id3learning.Learn(inputs, outputs);
    
            Console.WriteLine();
            // now try to predict some cases that weren't in the training data
            for (int x = numtrainingcases+1; x <= 2* numtrainingcases; x++)
            {
                int[] query = CalcInputs(x);
    
                int answer = tree.Decide(query); // makes the prediction
    
                Assert.AreEqual(CalcY(x), answer); // check the answer is what we expected - ie the tree got it right
                Console.WriteLine("Prediction \t\t " + x+"\t "+answer);
            }
        }
    

    This is the output it produces:

                     x   y
    TrainingData     1   1
    TrainingData     2   2
    TrainingData     3   3
    TrainingData     4   2
    TrainingData     5   1
    TrainingData     6   2
    TrainingData     7   3
    TrainingData     8   2
    TrainingData     9   1
    TrainingData     10  2
    TrainingData     11  3
    TrainingData     12  2
    
    Prediction       13  1
    Prediction       14  2
    Prediction       15  3
    Prediction       16  2
    Prediction       17  1
    Prediction       18  2
    Prediction       19  3
    Prediction       20  2
    Prediction       21  1
    Prediction       22  2
    Prediction       23  3
    Prediction       24  2
    

    Hope that helps.

    EDIT: Following the comments, the example below is modified to train on previous values of the target (Y) rather than on features derived from the time index (X). This means you can't start training at the very start of your series, because each case needs a back history of previous values of Y (five here, so x must be at least 6 for CalcY to stay in range). In this example I started at x=9 just because that keeps the same sequence.

        // this is the sequence y follows
        int[] ysequence = new int[] { 1, 2, 3, 2 };
    
        // this generates the correct Y for a given X (valid for x >= 1)
        int CalcY(int x) => ysequence[(x - 1) % 4];
    
        // this generates the inputs - the five previous values of y
        int[] CalcInputs(int x) => new int[] { CalcY(x-1), CalcY(x-2), CalcY(x-3), CalcY(x-4), CalcY(x - 5) };
        // earlier version, using moduli of x:
        //int[] CalcInputs(int x) => new int[] { x % 2, x % 3, x % 4, x % 5, x % 6 };
    
    
        // for http://stackoverflow.com/questions/40573388/simple-accord-net-machine-learning-example
        [TestMethod]
        public void AccordID3TestTestStackOverFlowQuestion2()
        {
            // build the training data set
            int numtrainingcases = 12;
            int starttrainingat = 9;
            int[][] inputs = new int[numtrainingcases][];
            int[] outputs = new int[numtrainingcases];
    
            Console.WriteLine("\t\t\t\t x \t y");
            for (int x = starttrainingat; x < numtrainingcases + starttrainingat; x++)
            {
                int y = CalcY(x);
                inputs[x- starttrainingat] = CalcInputs(x);
                outputs[x- starttrainingat] = y;
                Console.WriteLine("TrainingData \t " +x+"\t "+y);
            }
    
            // define how many values each input can have
            DecisionVariable[] attributes =
            {
                new DecisionVariable("y-1",4),
                new DecisionVariable("y-2",4),
                new DecisionVariable("y-3",4),
                new DecisionVariable("y-4",4),
                new DecisionVariable("y-5",4)
            };
    
            // define how many outputs (+1 only because y doesn't use zero)
            int classCount = outputs.Max()+1;
    
            // create the tree
            DecisionTree tree = new DecisionTree(attributes, classCount);
    
            // Create a new instance of the ID3 algorithm
            ID3Learning id3learning = new ID3Learning(tree);
    
            // Learn the training instances! Populates the tree
            id3learning.Learn(inputs, outputs);
    
            Console.WriteLine();
            // now try to predict some cases that weren't in the training data
            // predict the next numtrainingcases values (as in the first example)
            for (int x = starttrainingat+numtrainingcases; x < starttrainingat + 2 * numtrainingcases; x++)
            {
                int[] query = CalcInputs(x);
    
                int answer = tree.Decide(query); // makes the prediction
    
                Assert.AreEqual(CalcY(x), answer); // check the answer is what we expected - ie the tree got it right
                Console.WriteLine("Prediction \t\t " + x+"\t "+answer);
            }
        }
    

    You could also consider training on the differences between previous values of Y, which would work better where the relative change matters more than the absolute value of Y.
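
    As a rough sketch of that idea (plain C#, no Accord; `DiffFeatureSketch`, `BuildDiffFeatures` and the `offset` parameter are hypothetical names introduced here), the helper below turns the most recent values of Y into consecutive differences. The offset is there on the assumption that the features will go to a discrete learner such as ID3, which expects non-negative integer symbols; for this series the differences are -1 or +1, so an offset of 1 keeps them in 0..2. Note that it indexes a zero-based array of observed values rather than the 1-based x used above.

```csharp
using System;

public static class DiffFeatureSketch
{
    // Builds `lags` difference features for predicting y[t]:
    //   feature[k] = (y[t-1-k] - y[t-2-k]) + offset,  k = 0 .. lags-1
    // Requires t >= lags + 1 so every index falls inside the series;
    // t == y.Length is allowed (predicting the next, unseen value).
    public static int[] BuildDiffFeatures(int[] y, int t, int lags, int offset)
    {
        if (t - 1 - lags < 0 || t > y.Length)
            throw new ArgumentOutOfRangeException(nameof(t), "not enough history before t");

        var features = new int[lags];
        for (int k = 0; k < lags; k++)
            features[k] = (y[t - 1 - k] - y[t - 2 - k]) + offset;
        return features;
    }
}
```

    For example, with `y = { 1, 2, 3, 2, 1, 2, 3, 2 }`, `BuildDiffFeatures(y, 4, 2, 1)` gives `{ 0, 2 }` (the last step was -1, the one before was +1). Each feature would then need a DecisionVariable with 3 symbols instead of 4.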
