In his latest guest post, managing director of GumGum, Jon Stubley (pictured below), takes a look at computer vision and the impact it’s already having on people’s lives. And that includes making better ads, too…
Last year’s launch of the iPhone X introduced millions of people to the concept of computer vision. They might not know the actual term for it (or the specifics of how it works), but they’d be aware that the new phone ‘recognises’ faces as part of its Face ID sensor system, thanks to clever technology that is able to make sense of what it sees.
What fewer people are aware of, however, is how much computer vision is already beginning to impact their lives (regardless of whether they have stumped up over $1000 for the iPhone X) – and its impact is only going to accelerate.
Computer-vision-designed clothes have yet to hit the aisles of K-Mart – but it is only a matter of time. During Melbourne Spring Fashion Week 2016, couture designer Jason Grech worked with IBM Watson to understand the latest runway trends and designed a 12-piece collection called ‘Cognitive Couture’. Whilst undoubtedly a marketing stunt, it demonstrates how computer vision can be used in the retail fashion sector – in this instance analysing ten years of runway fashion and social media buzz to help the designer and his team explore and evaluate trends, colours and textures during the creative process.
Another IBM Watson-powered initiative appeared during the recent US Open Tennis Championships, when the United States Tennis Association (USTA) used Watson Media to automatically generate match highlights to be shared through social media. The machine analysed images and video, as well as language, sentiment and tone – for example, sounds from the crowd that signalled something special was happening on court, as well as footage of winning plays, victory actions such as fist pumps by players, and facial expressions. I expect to see more large-scale televised events using this technology going forward – it is able to manage raw video footage in near real-time, making the seemingly impossible very much achievable.
Enabled by the aforementioned iPhone X, University of Sydney doctoral student Mia Harrison and her colleagues released a short video set to the Queen classic “Bohemian Rhapsody” and starring animated iPhone X emoji using the iPhone’s “Animoji” feature. The feature lets the phone’s camera capture your facial muscle movements using computer vision and then transposes them into the facial expressions of animated emoji. This makes tech that was previously the preserve of the big movie studios accessible to all – it is quite literally in your pocket. Animoji leverages the power of the iPhone X’s A11 Bionic processor and Apple’s ARKit (Augmented Reality Kit) software framework. (Google has a competing framework for Android called ARCore.) Incidentally, Mia Harrison’s work caught the attention of a famed tech writer and former Wall Street Journal and Recode columnist, who tweeted, “Genius iPhone X creation! And it’s only been out a couple of weeks. Watch this (or any of the thousands of other animoji karaoke videos that have already cropped up online) and imagine what’s coming with this technology.”
For some it’s a little too Big Brother-ish, but the European Commission partly funded a trial to develop a computer vision system that detects suspicious behaviour in CCTV footage as it happens. The P-React project aimed to help catch criminals in the act and flag the relevant video clips to authorities. According to New Scientist, the algorithms have been taught to highlight atypical behaviour and have been trained on example scenes of people fighting and people chasing someone who has snatched a bag. It promises to help police with digital overload – in cities there are so many CCTV cameras that it can be hard to pinpoint the footage that has value for investigations and prosecutions.
Better ads and better insights
Computer vision is also helping to make the ads we see more relevant – in a number of ways. Firstly, it is behind the ads that my own company enables – contextual ads that appear within or around editorial images on a page and that don’t interrupt the consumer (they are CBA compliant). The tech is driven by computer vision that scans images, videos and the context around them – so an editorial image of a blonde film actress becomes a natural placement for a shampoo formulated for blonde hair. Computer vision is also being used to better understand how consumers are actually responding to a new advert. Affectiva, an outgrowth of the MIT Media Lab, helps brands (including Mars, Kellogg’s and CBS) use computer vision and deep learning algorithms to make sense of viewers’ facial reactions as they watch content, via its Emotion SDK (software development kit) that works across mobile devices and standard desktop webcams.
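To make the contextual-matching idea concrete, here is a toy sketch (not GumGum’s actual system – the ad inventory, tags and keyword-overlap scoring are invented purely for illustration): once a computer vision model has tagged what an editorial image contains, pairing it with a relevant ad can be as simple as scoring keyword overlap.

```python
# Hypothetical ad inventory: each ad lists the image tags it targets.
AD_INVENTORY = {
    "blonde-shampoo": {"blonde", "hair", "actress"},
    "running-shoes": {"running", "marathon", "fitness"},
    "sunscreen": {"beach", "summer", "sun"},
}

def match_ad(image_tags):
    """Return the ad whose target keywords best overlap the image's tags,
    or None if nothing overlaps at all."""
    tags = set(image_tags)
    best_ad, best_score = None, 0
    for ad, keywords in AD_INVENTORY.items():
        score = len(tags & keywords)
        if score > best_score:
            best_ad, best_score = ad, score
    return best_ad

# An image a vision model tagged as a blonde actress maps to the shampoo ad.
print(match_ad(["blonde", "actress", "red carpet"]))  # -> blonde-shampoo
```

In a real system the tags would come from a trained image-recognition model and the matching would weigh context, sentiment and brand-safety signals, but the underlying principle – image understanding feeding ad selection – is the same.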
The integration of computer vision into all aspects of our day-to-day lives is skyrocketing. For marketers and publishers, it opens up a host of possibilities that should be given careful consideration as the technology rapidly matures.