In the history of life on Earth, animals gained the gift of organic vision approximately 700 million years ago. Much later, humans used technology to invent the camera and machine vision, which function much like the human eye. Interpreting what is seen and making decisions that suit the situation, however, requires a higher level of intelligence.
With the Internet of Things (IoT), it is now possible to add awareness to such digital eyes and keep ourselves informed when something out of the ordinary or unfamiliar happens around us. Uncanny Vision, a startup founded by Ranjith Parakkal and Navaneethan Sundaramoorthy, has a product built around this idea: teaching an IoT device to identify objects using an analytical system powered by deep learning and smart algorithms. At the grassroots level, the system has to be shown the various images and visuals it needs to become familiar with, so that it understands how these normally appear. In essence, it is like teaching a child about the world and correcting the child's perception in order to set proper benchmarks.
Learning the unfamiliar or uncanny
In the biological world, vision is crucial for survival. Visual data helps us make better decisions about many activities. Take the example of a surveillance camera designed to watch over a specific place on the street that people visit frequently. The camera seems to have no objective other than staring at a fixed point of interest. In the case of a mishap, what if this camera could trigger a push notification to the nearest police station and help the police arrive in a jiffy? That can be accomplished only if an extra awareness feature is added to the set-up, so that it carries out better surveillance while simplifying the task for the human involved in decision making.
Uncanny Vision works in two prime areas. One is on the edge, which is at the user’s end, and the other is on the IoT-Cloud infrastructure. This combination is very powerful when it comes to making human-like decisions. The whole system is like an intelligent being that can recognise anything it is programmed to see, even a moving object.
When we say programmed to see, there are a couple of ways that this can be achieved. One way is to train a model (neural network) with thousands of labelled images of target objects that you would like the camera to recognise, for example, gun, helmet, people or wallet. This trained model is optimised and programmed onto the camera hardware. The camera can then recognise objects with reasonable accuracy.
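To make the idea of learning from labelled examples concrete, here is a minimal sketch in Python. It is not Uncanny Vision's method: instead of a neural network trained on thousands of images, it uses a much simpler nearest-centroid classifier, and the two-number feature vectors are invented stand-ins for whatever a real system would extract from an image. The function names train and predict are hypothetical.

```python
import math
from collections import defaultdict

def train(examples):
    """Average the feature vectors of each label into one centroid per label.

    examples is a list of (feature_vector, label) pairs. The features here
    are made-up numbers, not real image data.
    """
    grouped = defaultdict(list)
    for features, label in examples:
        grouped[label].append(features)
    return {
        label: [sum(column) / len(column) for column in zip(*vectors)]
        for label, vectors in grouped.items()
    }

def predict(centroids, features):
    """Return the label whose centroid is closest to the given features."""
    return min(centroids, key=lambda label: math.dist(centroids[label], features))

# Toy labelled data (assumed values, for illustration only).
labelled = [
    ([0.9, 0.1], "helmet"),
    ([0.8, 0.2], "helmet"),
    ([0.1, 0.9], "wallet"),
    ([0.2, 0.8], "wallet"),
]
centroids = train(labelled)
guess = predict(centroids, [0.85, 0.15])
```

The shape of the workflow is the same as in the text: labelled examples go in, a compact model comes out, and that model is what would be loaded onto the camera to label new inputs.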
Another way is to have a self-learning camera. In this scenario, the camera (pre-loaded with an untrained model) is given a certain amount of video data and is told that everything in that sample data is normal behaviour. The model learns the definition of normal with the given sample data. The camera is then switched to surveillance mode. It watches and flags any behaviour that it has not seen before, as an anomaly.
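The self-learning approach can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the company's implementation: it replaces the neural network with plain per-feature statistics, and the features (a person's horizontal position in the frame and apparent height in metres) as well as the class name NormalityModel and the threshold of 3.0 are all hypothetical.

```python
from statistics import mean, pstdev

class NormalityModel:
    """Learns what 'normal' looks like from samples assumed to be normal."""

    def __init__(self, threshold=3.0):
        # Assumed cut-off: how many standard deviations count as unusual.
        self.threshold = threshold
        self.means = []
        self.stds = []

    def fit(self, samples):
        """Record the mean and spread of each feature in the sample data."""
        columns = list(zip(*samples))
        self.means = [mean(column) for column in columns]
        # A small floor avoids division by zero for constant features.
        self.stds = [max(pstdev(column), 1e-6) for column in columns]

    def score(self, sample):
        """Largest deviation from the learned mean, in standard deviations."""
        return max(
            abs(value - m) / s
            for value, m, s in zip(sample, self.means, self.stds)
        )

    def is_anomaly(self, sample):
        return self.score(sample) > self.threshold

# Learning phase: every sample is declared normal (invented values).
normal_samples = [
    (0.50, 1.70), (0.52, 1.65), (0.48, 1.75), (0.51, 1.68), (0.49, 1.72),
]
model = NormalityModel()
model.fit(normal_samples)

# Surveillance phase: a familiar pose and a much lower, crouching one.
standing_flagged = model.is_anomaly((0.50, 1.70))
crouching_flagged = model.is_anomaly((0.50, 0.60))
```

Note that the model is never told what an anomaly looks like. It only learns the range of the sample data and flags whatever falls far outside it, which mirrors the behaviour described above.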
An excellent application of this would be to track human movement at an automated teller machine (ATM). People visit these places to swipe cards and take out cash. An ATM might not look like a fast-moving assembly line, but on any given day thousands of people visit ATMs. Let us take an example. If you approach an ATM and do not stand in the position that the system normally recognises, the system flags this as an anomaly or unfamiliar sighting.
Smart surveillance is achieved here by giving intelligent sight to the camera itself. If you withdraw money and leave as usual, the system does not trigger an anomaly, but if you approach the ATM in an unfamiliar way, the analytical system rates the event on a scale that spans from red to green.
Any normal, expected human activity, for example, standing or walking, would be in the green zone, whereas an unfamiliar or unexpected activity, such as falling down, crouching or raising your arms above your head, would be in the red zone.
When the system realises that an activity has triggered the red zone, an alert is sent to the responsible personnel or emergency system that it has been programmed to contact. The system can also be taught to detect unauthorised objects that people wear, such as helmets or headscarves, for early detection of abnormal behaviour.
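The red-to-green scale and the alert step could look roughly like the following Python sketch. The score range of 0 to 1, the two thresholds, the intermediate amber zone and the function names zone_for and handle_event are all assumptions made for illustration; the article does not say how the real system scores events or delivers alerts.

```python
from typing import Callable, List

RED_THRESHOLD = 0.8    # assumed cut-off for the red zone
AMBER_THRESHOLD = 0.5  # assumed cut-off for an intermediate zone

def zone_for(score: float) -> str:
    """Map an anomaly score between 0 and 1 onto the colour scale."""
    if score >= RED_THRESHOLD:
        return "red"
    if score >= AMBER_THRESHOLD:
        return "amber"
    return "green"

def handle_event(activity: str, score: float,
                 notify: Callable[[str], None]) -> str:
    """Classify one event and send an alert only if it lands in the red zone."""
    zone = zone_for(score)
    if zone == "red":
        notify(f"ALERT: {activity} (score {score:.2f})")
    return zone

# A plain list stands in for the real alert channel (police, guard, etc.).
sent: List[str] = []
handle_event("walking", 0.10, sent.append)
handle_event("arms raised above head", 0.93, sent.append)
```

Passing the notifier in as a function keeps the scoring logic separate from the delivery channel, so the same rule could feed a push notification, an SMS gateway or a control-room dashboard.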
Algorithms that learn and adapt
Deep learning may sound fancy, but it forms the heart of the system. Uncanny Vision’s computer vision software has 70-plus algorithms that cover everything from pedestrian detection and vehicle detection to histograms and homography estimation, and is faster than OpenCV (an open source library used for image processing).