A technology is described in which images initially ranked according to a relevance estimate (e.g., according to text-based similarities) are re-ranked according to their visual similarity to a user-selected image. The user-selected image is received and classified into an intention class, such as a scenery class, a portrait class, and so on. The intention class determines how the visual features of the other images are compared with the visual features of the user-selected image. For example, the comparison operation may use a different feature weighting depending on which intention class was determined for the user-selected image. The other images are re-ranked based on their computed similarity to the user-selected image and are returned as query results.
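The re-ranking step can be illustrated with a minimal Python sketch. It is not the claimed implementation: the feature names (`color`, `texture`, `face`), the class labels, the weight values in `INTENT_WEIGHTS`, and the use of a weighted sum of Euclidean distances are all assumptions made for illustration, and the intention class is taken as already determined rather than computed by a classifier.

```python
from math import sqrt

# Hypothetical per-class feature weights; the values are illustrative only.
INTENT_WEIGHTS = {
    "scenery": {"color": 0.7, "texture": 0.2, "face": 0.1},
    "portrait": {"color": 0.2, "texture": 0.1, "face": 0.7},
}


def feature_distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def weighted_distance(query_feats, cand_feats, weights):
    """Combine per-feature distances using the weights of one intention class."""
    return sum(
        w * feature_distance(query_feats[name], cand_feats[name])
        for name, w in weights.items()
    )


def rerank(query_feats, intent, candidates):
    """Return candidates ordered from most to least similar to the query."""
    weights = INTENT_WEIGHTS[intent]
    return sorted(
        candidates,
        key=lambda c: weighted_distance(query_feats, c["features"], weights),
    )


# Toy data: image "a" matches the query in color, image "b" matches in face.
query = {"color": [0.9, 0.1], "texture": [0.5], "face": [0.0]}
candidates = [
    {"id": "a", "features": {"color": [0.9, 0.1], "texture": [0.5], "face": [1.0]}},
    {"id": "b", "features": {"color": [0.1, 0.9], "texture": [0.5], "face": [0.0]}},
]

for intent in ("scenery", "portrait"):
    print(intent, [c["id"] for c in rerank(query, intent, candidates)])
```

With this toy data the same two candidates come back in a different order for each class, because the color difference dominates under the first weighting and the face difference dominates under the second.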