I have a dataset of articles containing French and English content. My final goal is to group similar articles into the same cluster.
What I did first is word embed