We all know how science fiction movies tend to depict computers or robots that can interact with human beings. These robots are often far more logical than humans and have a complete understanding of the world we live in. For robots to behave in this way, they need the ability to see and make sense of their surroundings.

Object recognition is the area of study within artificial intelligence that focuses on enabling computers to recognise objects they have already seen. Early attempts remembered objects in terms of their shapes and colours, but unfortunately, moving the object even slightly had a dramatic effect on their success.

Consequently, researchers looked elsewhere. Unlike machines, people can recognise objects from different angles, in different lighting conditions, and even when partially hidden. It was thought that humans do not recognise an object as a whole, but by its parts, which collectively suggest the object: a viewer recognises an object if many of its parts are visible.

It was further hypothesised that these parts are based not on colour or geometry, but on certain key points of the object. Over a decade ago, a technique called the scale-invariant feature transform (Sift) was developed around this idea of mimicking human vision, and the results were remarkable. Although rather slow, Sift outperformed all other techniques and set a new standard in the field.
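As an illustration of what this looks like in practice, here is a minimal sketch of Sift keypoint detection using the OpenCV library. The image file name is hypothetical, and this is only one implementation of the idea, not necessarily the one used in the research.

```python
# A minimal sketch of Sift keypoint detection using OpenCV
# (pip install opencv-python). The image path is hypothetical.
import cv2

image = cv2.imread("object.jpg", cv2.IMREAD_GRAYSCALE)

# Detect key points and compute a descriptor for each one.
# It is these descriptors, not raw pixels, that are compared later,
# which is why recognition survives rotation and resizing.
sift = cv2.SIFT_create()
keypoints, descriptors = sift.detectAndCompute(image, None)

print(f"Found {len(keypoints)} key points")
```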

Research into object recognition technologies has continued; however, modern algorithms still follow Sift and the inspiration it was originally based on. One of the better-known of these techniques is speeded-up robust features (Surf), which attempted to keep the same quality of results as Sift while taking many shortcuts to improve the speed of recognition.
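Surf can be sketched in almost the same way. Note the assumptions here: Surf ships in the separate opencv-contrib package and, being patented, is only available in builds compiled with the non-free modules enabled, and the threshold value below is purely illustrative.

```python
# A sketch of Surf keypoint detection. Surf lives in opencv-contrib
# and requires a build with the non-free modules enabled; this is an
# assumption about the installation. The image path is hypothetical.
import cv2

image = cv2.imread("object.jpg", cv2.IMREAD_GRAYSCALE)

# The Hessian threshold controls how many key points are kept;
# 400 is a commonly used illustrative value, not one from the research.
surf = cv2.xfeatures2d.SURF_create(hessianThreshold=400)
keypoints, descriptors = surf.detectAndCompute(image, None)

print(f"Found {len(keypoints)} key points")
```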

The object of my research was to evaluate Surf and Sift in the presence of different visual changes: rotation, resizing, lighting and image quality. Both methods continued to recognise 80 per cent of the objects even after the images had been degraded by these changes. These results can be improved further by using techniques that detect some of the mistaken matches. Overall, the research found no great distinction between using one algorithm over the other. Other studies have reached similar conclusions, both when comparing just these two techniques and when including others.
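One standard way of detecting mistaken matches, and a plausible reading of the error-detection techniques mentioned above (the research may have used something different), is Lowe's ratio test: a match is kept only if it is clearly better than the second-best candidate. A sketch, with hypothetical file names:

```python
# Matching an image against a transformed copy of itself, then
# filtering out mistakes with Lowe's ratio test. Paths are hypothetical.
import cv2

sift = cv2.SIFT_create()
img1 = cv2.imread("object.jpg", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("object_rotated.jpg", cv2.IMREAD_GRAYSCALE)

kp1, des1 = sift.detectAndCompute(img1, None)
kp2, des2 = sift.detectAndCompute(img2, None)

# For each descriptor, find its two nearest neighbours in the other image.
matcher = cv2.BFMatcher()
matches = matcher.knnMatch(des1, des2, k=2)

# Ratio test: keep a match only if it beats the runner-up comfortably.
good = [m for m, n in matches if m.distance < 0.75 * n.distance]
print(f"{len(good)} matches survive the ratio test")
```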

This research showed that computers can now recognise objects in the presence of linear transformations and degradation. In the real world, however, recognition must cope with changes known as projective transformations, which include linear transformations but also many other changes, such as skewing. If researchers could obtain the same results under projective transformations, computers might finally be able to understand our reality.
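To give a sense of the gap, the sketch below contrasts a linear change (a rotation) with a projective warp that also skews the image, again using OpenCV; the image file and the homography values are illustrative assumptions, not figures from the research.

```python
# Contrasting a linear transformation (rotation) with a projective one.
# The homography values below are illustrative only.
import cv2
import numpy as np

image = cv2.imread("object.jpg")
h, w = image.shape[:2]

# Linear case: rotate 30 degrees about the centre (a 2x3 affine matrix).
M = cv2.getRotationMatrix2D((w / 2, h / 2), 30, 1.0)
rotated = cv2.warpAffine(image, M, (w, h))

# Projective case: a full 3x3 homography whose bottom row introduces
# perspective skew, the kind of change real-world viewing produces.
H = np.array([[1.0,    0.2,    0.0],
              [0.1,    1.0,    0.0],
              [0.0005, 0.0002, 1.0]])
skewed = cv2.warpPerspective(image, H, (w, h))

cv2.imwrite("rotated.jpg", rotated)
cv2.imwrite("skewed.jpg", skewed)
```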

This research project and the related Master’s were carried out following the award of a Steps scholarship, which is part-financed by the European Social Fund.
