I need to return a distinct list of records based on a car keywords search like: \"Alfa 147\"
The problem is that, as I have 3 \"Alfa\" cars, it returns 1 + 3 records (i
Maybe this is close? At least the subqueries open it up a little for you to work with.
var query =
from c in categoryQuery
let keywords =
(
from k in keywordQuery where splitKeywords.Contains(k.Name)
join kac in keywordAdCategoryQuery on k.Id equals kac.Keyword_Id
where kac.Category_Id == c.Id
join a in adQuery on kac.Ad_Id equals a.Id
select k.Id
).Distinct()
where keywords.Any()
select new CategoryListByKeywordsDetailDto
{
Id = c.Id,
Name = c.Name,
SearchCount =
(
from kac in keywordAdCategoryQuery
where kac.Category_Id == c.Id
join kId in keywords on kac.Keyword_Id equals kId
select kac.Id
).Distinct().Count(),
ListController = c.ListController,
ListAction = c.ListAction
};
One of the beautiful features of linq is that you can build up complicated queries in smaller and simpler steps and let linq figure out how to join them all together.
The following is one way to get this information. I'm not sure whether this is the best and you would need to check it performs well when multiple keywords are selected.
Assuming keywords is defined something like
var keywords = "Alfa 147";
var splitKeywords = keywords.Split(new char[] {' '});
Stage 1
Get a list of keywords grouped by Ad and Category and
var subQuery = (from kac in keywordAdCategoryQuery
join k in keywordQuery on kac.Keyword_Id equals k.Id
select new
{
kac.Ad_Id,
kac.Category_Id,
KeyWord = k.Name,
});
var grouped = (from r in subQuery
group r by new { r.Ad_Id, r.Category_Id} into results
select new
{
results.Key.Ad_Id ,
results.Key.Category_Id ,
keywords = (from r in results select r.KeyWord)
});
Note, the classes you posted would suggest that your database does not have foreign key relationships defined between the tables. If they did then this stage would be slightly simpler to write.
Stage 2
Filter out any groups that do not have each of the keywords
foreach(var keyword in splitKeywords)
{
var copyOfKeyword = keyword ; // Take copy of keyword to avoid closing over loop
grouped = (from r in grouped where r.keywords.Contains(copyOfKeyword) select r) ;
}
Stage 3
Group by Category and count the results per category
var groupedByCategories = (from r in grouped
group r by r.Category_Id into results
join c in categoryQuery on results.Key equals c.Id
select new
{
c.Id ,
c.Name ,
Count = results.Count()
});
Stage 4
Now retrieve the information from sql. This should be done all in one query.
var finalResults = groupedByCategories.ToList();
Split the query string into an array and iterate through querying the database for each keyword and joining the result sets using unions. The resultant set will be every distinct record that matches any of the given keywords.
You are doing a select distinct on a list of CategoryListByKeywordsDetailDto. Distinct only works on POCO and anonymous objects. In your case you need to implement the IEqualitycomparer for select distinct to work.
I tried this using LINQ directly against in memory collections (as in, not through SQL) - seems to work for me (I think the main point being that you want to search for Ads that apply to ALL the keywords specified, not ANY, correct? Anyway, some sample code below (a little comment-ish and not necessarily the most efficient, but hopefully illustrates the point...)
Working with the following "data sets":
private List<AdCar> AdCars = new List<AdCar>();
private List<KeywordAdCategory> KeywordAdCategories = new List<KeywordAdCategory>();
private List<Category> Categories = new List<Category>();
private List<Keyword> Keywords = new List<Keyword>();
which are populated in a test method using the data you provided...
Search method looks a little like this:
var splitKeywords = keywords.Split(' ');
var validKeywords = Keywords.Join(splitKeywords, kwd => kwd.Name.ToLower(), spl => spl.ToLower(), (kwd, spl) => kwd.Id).ToList();
var groupedAdIds = KeywordAdCategories
.GroupBy(kac => kac.Ad_Id)
.Where(grp => validKeywords.Except(grp.Select(kac => kac.Keyword_Id)).Any() == false)
.Select(grp => grp.Key)
.ToList();
var foundKacs = KeywordAdCategories
.Where(kac => groupedAdIds.Contains(kac.Ad_Id))
.GroupBy(kbc => kbc.Category_Id, kac => kac.Ad_Id);
//Results count by category
var catCounts = Categories
.Join(foundKacs, cat => cat.Id, kacGrp => kacGrp.Key, (cat, kacGrp) => new { CategoryName = cat.Name, AdCount = kacGrp.Distinct().Count() })
.ToList();
//Actual results set
var ads = AdCars.Join(groupedAdIds, ad => ad.Id, grpAdId => grpAdId, (ad, grpAdId) => ad);
As I said, this is more to illustrate, please don't look too closely at the use of Joins & GroupBy etc (not sure its exactly, er, "optimal")
So, using the above, if I search for "Alfa", I get 3 Ad results, and if I search for "Alfa 147" I get just 1 result.
EDIT: I've changed the code to represent two possible outcomes (as I wasn't sure which was needed by your question)
ads
will give you the actual Ads returned by the search
catCounts
will give a list of anonymous types each representing the find results as a count of Ads by category
Does this help?
Should be possible to query for each keyword then union the result sets. The duplicate values will be removed from the union and you can work out the required aggregations.