I use factors somewhat infrequently and generally find them comprehensible, but I am often fuzzy on the details of specific operations. Currently, I am coding/collapsing cat
I think the easiest way is to relabel every naics code that is not in the top 8 with a special sentinel value:
data$naics[!(data$naics %in% top8)] = -99
Then you can use the "exclude" argument when turning it into a factor. The sentinel is dropped from the levels, and the entries that held it become NA:
factor(data$naics, exclude=-99)
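
For reference, here is a minimal end-to-end sketch of the same idea on made-up data. The toy codes, the variable names `counts`, `top`, and `naics_f`, and the choice of "top 2 by frequency" (instead of the top 8) are all assumptions for illustration; it also assumes naics is stored as a numeric column, not already as a factor.

```r
# Hypothetical toy data: numeric NAICS-style codes, not the asker's real data.
data <- data.frame(naics = c(11, 11, 11, 21, 21, 22, 23, 23, 23, 23, 31))

# Assumption: "top" means most frequent. Keep the top 2 here (the thread uses 8).
counts <- sort(table(data$naics), decreasing = TRUE)
top <- as.numeric(names(counts)[1:2])

# Relabel everything outside the top codes with the sentinel value -99.
data$naics[!(data$naics %in% top)] <- -99

# Build the factor; -99 is excluded from the levels and shows up as NA.
naics_f <- factor(data$naics, exclude = -99)
levels(naics_f)
table(naics_f, useNA = "ifany")
```

If the collapsed codes should stay as their own category instead of becoming NA, leaving out `exclude` keeps "-99" as an ordinary level.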