Question
I have two entities: Projects and Student Lists. One Project can have many Student lists.
I am attempting to join student list on project and only return the first row for each project based on custom ordering of student lists.
Attempted Subquery:
    _whens = {ProjectStatus.APPROVED: 1, ProjectStatus.REJECTED: 2,
              ProjectStatus.SUBMITTED: 3, None: 4}
    sort_order = case(value=StudentList.student_list_status_id, whens=_whens)
    # Parentheses let the method chain span several lines.
    return (self._session.query(StudentList)
            .filter(StudentList.project_id == Project.project_id)
            .order_by(sort_order).limit(1).subquery())
Above, I define the custom ordering based on the student list status id. The function returns the subquery, which I then attempt to join to my outer Project query below (student_list_subquery refers to what is returned above):
    projects = (self._session.query(models.Project)
                .filter(models.Project.project_year == year)
                .join(student_list_subquery,
                      student_list_subquery.c.project_id == Project.project_id)
                .all())
Below is the relevant SQL output:
    FROM project
    LEFT OUTER JOIN (SELECT student_list.project_id AS project_id,
                            student_list.student_list_id AS student_list_id
                     FROM student_list, project
                     WHERE project.project_id = student_list.project_id
                     ORDER BY CASE student_list.student_list_status_id WHEN 102 THEN 1
                              WHEN 105 THEN 2 WHEN 101 THEN 3 WHEN NULL THEN 4 END
                     LIMIT 1) AS anon_1 ON anon_1.project_id = project.project_id
I am using MySQL, so DISTINCT ON solutions won't work, and neither will ROW_NUMBER()/PARTITION BY solutions.
I seem to be having the same issue as the one raised in "SQLAlchemy: FROM entry still present in correlated subquery".
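To illustrate why the SQL above cannot return one student list per project, here is a minimal sketch that replays the same query shape. It assumes an in-memory SQLite database instead of MySQL (chosen only so the example runs with the standard library), and the sample rows are hypothetical. Because `project` appears again in the inner FROM clause, the subquery is not correlated with the outer `project`, so `LIMIT 1` keeps a single row for the entire derived table.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; the table layout mirrors the SQL above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE project (project_id INTEGER PRIMARY KEY);
    CREATE TABLE student_list (
        student_list_id INTEGER PRIMARY KEY,
        project_id INTEGER,
        student_list_status_id INTEGER
    );
    -- Hypothetical sample data: project 1 has two lists, project 2 has one.
    INSERT INTO project VALUES (1), (2);
    INSERT INTO student_list VALUES (10, 1, 101), (11, 1, 102), (12, 2, 105);
""")

rows = conn.execute("""
    SELECT project.project_id, anon_1.student_list_id
    FROM project
    LEFT OUTER JOIN (
        SELECT student_list.project_id AS project_id,
               student_list.student_list_id AS student_list_id
        FROM student_list, project
        WHERE project.project_id = student_list.project_id
        ORDER BY CASE student_list.student_list_status_id
                     WHEN 102 THEN 1 WHEN 105 THEN 2 WHEN 101 THEN 3 END
        LIMIT 1
    ) AS anon_1 ON anon_1.project_id = project.project_id
    ORDER BY project.project_id
""").fetchall()

# Only one project gets a student list; the other is padded with NULL.
print(rows)
```

With this data, only project 1 is matched to a student list. Project 2 has a list of its own, but the single surviving subquery row belongs to project 1.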
Answer 1:
I finally solved the issue. I hope this helps anyone else trying to solve the first-n-per-group problem with SQLAlchemy and MySQL when the groups need custom ordering.
First, I have this function, which returns the single student_list_status_id with the highest priority for each project (hence the filter).
    @staticmethod
    def create_student_list_subquery(session):
        '''Create a correlated subquery that limits the result to one student
        list per project, using custom sorting to retrieve the highest-priority
        list per project based on status.'''
        sl2 = aliased(StudentList)
        list_id = sl2.student_list_status_id.label("list_id")
        _whens = {ProjectStatus.APPROVED: 1, ProjectStatus.REJECTED: 2,
                  ProjectStatus.SUBMITTED: 3, None: 4}
        sort_order = case(value=list_id, whens=_whens)
        # Filtering on Project.project_id ties the subquery to the outer query.
        return (session.query(list_id)
                .filter(sl2.project_id == Project.project_id)
                .order_by(sort_order)
                .limit(1))
Then I join the project status table (aliased as ps), whose id corresponds to the student_list_status_id returned by the query above, onto the project. After that, I can sort on the project status name, which was my goal.
    (self._session.query(models.Project)
     .filter(models.Project.project_year == year)
     .join(ps, ps.project_status_id == student_list_subq)
     .all())
Note that student_list_subq refers to the result of the create_student_list_subquery function above.
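For readers who want to see the kind of SQL this approach aims for, the following is a rough sketch written directly in SQL and run through Python's `sqlite3` module. It is not the exact statement SQLAlchemy emits. Assumptions: SQLite stands in for MySQL, the `project_status` table has a `name` column (a hypothetical name), the sample rows are invented, and the status ids 102/105/101 are taken from the SQL output in the question. The sketch uses `ELSE 4` rather than `WHEN NULL THEN 4`, because a simple CASE compares with `=`, which never matches NULL.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; table and column names follow the post,
# except project_status.name, which is hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE project (project_id INTEGER PRIMARY KEY);
    CREATE TABLE project_status (project_status_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE student_list (
        student_list_id INTEGER PRIMARY KEY,
        project_id INTEGER,
        student_list_status_id INTEGER
    );
    INSERT INTO project VALUES (1), (2), (3);
    INSERT INTO project_status VALUES
        (101, 'Submitted'), (102, 'Approved'), (105, 'Rejected');
    -- Project 1: submitted + approved; project 2: rejected + submitted;
    -- project 3: no student lists at all.
    INSERT INTO student_list VALUES
        (10, 1, 101), (11, 1, 102), (12, 2, 105), (13, 2, 101);
""")

rows = conn.execute("""
    SELECT project.project_id, ps.name
    FROM project
    JOIN project_status AS ps
      ON ps.project_status_id = (
          -- Correlated scalar subquery: evaluated once per outer project row.
          SELECT sl2.student_list_status_id
          FROM student_list AS sl2
          WHERE sl2.project_id = project.project_id
          ORDER BY CASE sl2.student_list_status_id
                       WHEN 102 THEN 1   -- approved first
                       WHEN 105 THEN 2   -- then rejected
                       WHEN 101 THEN 3   -- then submitted
                       ELSE 4            -- anything else, including NULL
                   END
          LIMIT 1
      )
    ORDER BY project.project_id
""").fetchall()

print(rows)
```

Each project is paired with the status of its highest-priority student list. Because this is an inner join, a project with no student lists (project 3 here) drops out of the result; an outer join would keep it with a NULL status.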
Source: https://stackoverflow.com/questions/50109243/sqlalchemy-limit-join-result-to-one-row-for-one-to-many-relationship