Calculate the mean value of sets of 4 sub-locations at multiple locations in a larger matrix

Posted by a 夏天 on 2019-12-12 03:54:55

Question


I am doing a data analysis on wall thickness measurements of circular tubes. I have the following matrix:

> head(datIn, 12)

        Component Tube.number Measurement.location Sub.location Interval  Unit      Start
1         In           1                    1            A      121      U6100  7/25/2000
2         In           1                    1            A      122      U6100  5/24/2001
3         In           1                    1            A      222      U6200  1/19/2001
4         In           1                    1            A      321      U6300   6/1/2000
5         In           1                    1            A      223      U6200  5/22/2002
6         In           1                    1            A      323      U6300  6/18/2002
7         In           1                    1            A       21      U6200  10/1/1997
8         In           1                    1            A      221      U6200   6/3/2000
9         In           1                    1            A      322      U6300 12/11/2000
10        In           1                    1            B      122      U6100  5/24/2001
11        In           1                    1            B      322      U6300 12/11/2000
12        In           1                    1            B       21      U6200  10/1/1997

        End Measurement Material.loss Material.loss.interval Run.hours.interval
1  5/11/2001         7.6           0.4                     NA            6653.10
2   2/7/2004         6.1           1.9                    1.5           15484.82
3   3/7/2002         8.5          -0.5                   -0.5            8826.50
4  12/1/2000         7.8           0.2                    0.2            4170.15
5  4/30/2003         7.4           0.6                    1.1            6879.73
6  9/30/2003         7.9           0.1                   -0.1            9711.56
7  4/20/2000         7.6           0.4                     NA           15159.94
8   1/5/2001         8.0           0.0                   -0.4            4728.88
9  5/30/2002         7.8           0.2                    0.0            9829.75
10  2/7/2004         5.9           2.1                    0.9           15484.82
11 5/30/2002         7.0           1.0                    0.7            9829.75
12 4/20/2000         8.2          -0.2                     NA           15159.94

 Run.hours.prior.to.interval Total.run.hours.end.interval
1                         0.00                      6653.10
2                      6653.10                     22137.92
3                     19888.82                     28715.32
4                         0.00                      4170.15
5                     28715.32                     35595.05
6                     30039.58                     39751.14
7                         0.00                     15159.94
8                     15159.94                     19888.82
9                     20209.83                     30039.58
10                     6653.10                     22137.92
11                    20209.83                     30039.58
12                        0.00                     15159.94

Straight.or.In.Out.Middle.bend.1 Straight.or.In.Out.Middle.bend.2
1                               Out                              Out
2                               Out                              Out
3                               Out                              Out
4                               Out                              Out
5                               Out                              Out
6                               Out                              Out
7                               Out                              Out
8                               Out                              Out
9                               Out                              Out
10                           Middle                              Out
11                           Middle                              Out
12                           Middle                              Out

The Sub.location column has the values A, B, C and D. These are measurements at the same measurement location but at different positions in the cross section, i.e. at 0, 90, 180 and 270 degrees around the tube.

I would like to make a plot that shows clearly which measurement location has the largest decrease in wall thickness over time.

To do this I first want to calculate the mean value of the wall thickness of a tube at each measurement location at each unique interval (the running hours are coupled to the interval).

I tried doing this with the following code:

par(mfrow=c(1,2))
myfunction <- function(mydata1) { return(mean(mydata1,na.rm=TRUE))}
AVmeasloc <- tapply(datIn$Measurement,list(as.factor(datIn$Sub.location),as.factor(datIn$Measurement.location), myfunction))
AVmeasloc

This doesn't seem to work. I would like to keep the tapply function, as I have also used it to calculate the standard deviation for some values and it lets me make plots easily.

Does anyone have any advice on how to tackle this problem?


Answer 1:


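In the code you've posted there is a misplaced parenthesis around list(): the list is closed after myfunction instead of before it, so the function ends up inside the index list. It should read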

AVmeasloc <-  tapply(datIn$Measurement,list(as.factor(datIn$Sub.location),as.factor(datIn$Measurement.location)), myfunction)

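This can then be simplified to the following, because tapply passes extra arguments such as na.rm=TRUE on to the function, and columns 3 and 4 of datIn (Measurement.location and Sub.location) can be used directly as the grouping index: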

AVmeasloc <- tapply(datIn$Measurement,datIn[,c(3,4)],mean,na.rm=TRUE)
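To check that the short form really does the same job as the wrapper function, the two can be compared on a small data frame. This is only a sketch: `demo` and its values are made up for illustration (only the column names follow the question), and selecting the index columns by name instead of by position is an optional variation, not part of the original answer.

```r
# Hypothetical stand-in for datIn; only the column names follow the question.
demo <- data.frame(
  Measurement.location = c(1, 1, 1, 1, 2, 2, 2, 2),
  Sub.location         = c("A", "A", "B", "B", "A", "A", "B", "B"),
  Measurement          = c(7.6, 6.1, 5.9, NA, 8.5, 7.8, 7.0, 8.2)
)

# Wrapper version, as in the corrected call above.
myfunction <- function(x) mean(x, na.rm = TRUE)
by_wrapper <- tapply(demo$Measurement,
                     list(demo$Measurement.location, demo$Sub.location),
                     myfunction)

# Short version: na.rm = TRUE is forwarded to mean(); columns picked by name.
by_name <- tapply(demo$Measurement,
                  demo[, c("Measurement.location", "Sub.location")],
                  mean, na.rm = TRUE)

# The numbers agree; only the dimnames labels differ between the two results.
same <- isTRUE(all.equal(unname(by_wrapper), unname(by_name)))
print(same)
```

Selecting by name keeps the call working if columns are later added to or reordered in the data frame.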

Here's a working example:

test.data <- data.frame(cat1 = c("A","A","A","B","B","B","C","C","D"),
                    cat2 = c(1,1,2,2,1,NA,2,1,1),
                    val = c(0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9))


tapply(test.data$val, test.data[,c(1,2)],mean,na.rm=TRUE)

    cat2
cat1    1   2
   A 0.15 0.3
   B 0.50 0.4
   C 0.80 0.7
   D 0.90  NA
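The question asks for the mean over the four sub-locations at each measurement location and each interval, which suggests grouping by Measurement.location and Interval rather than by Sub.location. A minimal sketch of that grouping, assuming column names as in the question; the `demo` data frame and all of its values are invented, and the subtraction at the end assumes that interval 121 precedes interval 122, as the dates in the question suggest:

```r
# Hypothetical stand-in for datIn; values are invented for illustration.
demo <- data.frame(
  Measurement.location = rep(c(1, 2), each = 8),
  Interval             = rep(rep(c(121, 122), each = 4), times = 2),
  Sub.location         = rep(c("A", "B", "C", "D"), times = 4),
  Measurement          = c(7.6, 7.8, 8.0, 7.4,   # location 1, interval 121
                           6.1, 5.9, 6.3, NA,    # location 1, interval 122
                           8.2, 8.0, 8.1, 8.1,   # location 2, interval 121
                           7.9, 7.7, 8.0, 7.6)   # location 2, interval 122
)

# Group by location and interval, so each cell pools sub-locations A to D.
idx <- demo[, c("Measurement.location", "Interval")]

# Mean and standard deviation of the wall thickness in each cell.
mean_by_loc <- tapply(demo$Measurement, idx, mean, na.rm = TRUE)
sd_by_loc   <- tapply(demo$Measurement, idx, sd,   na.rm = TRUE)

# Decrease in mean thickness between the two intervals, per location.
loss <- mean_by_loc[, "121"] - mean_by_loc[, "122"]
print(mean_by_loc)
print(loss)
```

With a result of this shape, `names(which.max(loss))` gives the measurement location with the largest decrease, and the rows of `mean_by_loc` can be plotted against the intervals or their running hours.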


Source: https://stackoverflow.com/questions/19680532/calculate-mean-value-of-sets-of-4-sub-locations-from-multiple-location-from-a-la
