Computer Vision: What Are The Advantages And Disadvantages 

Fei-Fei Li, Director of Stanford's Human-Centered AI Institute

 

June 29th, 2007 marked the release of the Apple iPhone, a technological innovation that brought smartphone usage into mainstream popularity. While it was a powerhouse in a class of its own twelve years ago, the original iPhone is a brick compared to its present-day descendants. However, how the first model’s utility and functionality stack up against current ones isn’t up for discussion here.

The technological advances made because of mobile devices like the iPhone are what’s worth talking about.

Computer technology has always been a rapidly growing field, with innovations becoming outdated within a few years. These advancements range from wireless charging to ever-increasing memory storage. A single smartphone is now many times more powerful than all the computing technology that landed man on the moon put together.

Out of all the trends that have come since the advent of mass smartphone usage, such as home tech syncing, mobile payment, and virtual reality, one particular trend has been gaining traction above all others: artificial intelligence.

AI has established itself as one of the leading trends when it comes to providing a unique and intuitive customer experience. The desire to augment our everyday lives through technology has resulted in a breakthrough four decades in the making, evident in services like Google Lens, Snapchat, Amazon Go, and more. This augmentation is commonly known as computer vision.

 

What Is Computer Vision?

Simply put, computer vision is an AI field that develops advanced image processing software for near limitless uses. The human eye can only process so much information before falling short of expectations and requirements. Training computers to effectively analyze the real and virtual world through images and photos lets us gather vast amounts of high-yield information at a greater pace than humans can manage on their own.

 


 

Computer vision as a machine learning concept is not new. As early as the 1950s, researchers were using technology to analyze environmental objects and systems and make educated, information-based deductions from them. In its infancy this meant identifying whether something was round or square, before moving on to distinguishing between typed and handwritten text.

The problem was that the technology of the time kept computer vision and machine learning from reaching their full potential. It wasn’t until much more recently that data sets and algorithms became advanced enough for computer vision applications to have practical usage. This is proving exceedingly useful for technology that relies on built-in mobile cameras.

Nowadays, open-source libraries like OpenCV, which was at the tip of the spear for modern computer vision software, make these tools readily accessible to everyone.

 

How Does Computer Vision Work?

For humans, processing images is so natural that we give it little to no thought in our daily lives. Computers, on the other hand, have to abide by their own unique processes in order to analyze massive amounts of media. For deep learning computer vision to take root, thousands upon thousands of photos, videos, and other images need to be compiled before an AI becomes useful. However, once sufficient data has been gathered, how does computer vision actually work to understand what it’s seeing?

Think of it as the software deconstructing the image. Rather than viewing it as a whole, it divides the image into a large grid of boxes - otherwise known as pixels - and assigns a number to each box.

 

[Image: a grayscale picture (left) and the grid of pixel values the computer stores for it (right)]

 

Source: Openframeworks

By breaking each image down to base components labelled by numeric value, the computer is then able to interpret and store each image based on the resulting array of numbers. The image on the left is what is provided, while the image on the right is how the computer actually processes the image.

The numerical value for each pixel can range from 0 to 255 in a basic image like the one provided, but that’s only because it’s in grayscale. Once you throw in color, things get a little more complicated.
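To make the grid-of-numbers idea concrete, here is a minimal Python sketch. The tiny 4x4 `image` and the helper function names are made up for illustration; real images hold thousands or millions of pixels.

```python
# A hypothetical 4x4 grayscale image: each number is one pixel's brightness,
# from 0 (black) to 255 (white).
image = [
    [0, 64, 128, 255],
    [32, 96, 160, 224],
    [16, 80, 144, 208],
    [0, 0, 255, 255],
]


def dimensions(img):
    """Return (height, width) of an image stored as a list of rows."""
    return len(img), len(img[0])


def brightest_pixel(img):
    """Return (row, column, value) of the first brightest pixel found."""
    best = (0, 0, img[0][0])
    for r, row in enumerate(img):
        for c, value in enumerate(row):
            if value > best[2]:
                best = (r, c, value)
    return best


def threshold(img, cutoff=128):
    """Turn every pixel into pure black (0) or pure white (255)."""
    return [[255 if value >= cutoff else 0 for value in row] for row in img]


if __name__ == "__main__":
    print(dimensions(image))
    print(brightest_pixel(image))
    for row in threshold(image):
        print(row)
```

Because the picture is just a grid of numbers, simple operations like finding the brightest spot or reducing it to black and white come down to ordinary arithmetic and comparisons.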

When accounting for color, the number of values per pixel jumps from one to three. Every color can be generated through a mix of the three primary colors of light: red, green, and blue. In order for the computer vision software to properly interpret a color image, it needs to apply a numerical value to each of the three primary colors in every pixel - hence the three numerical values.

 

[Image]

 

Source: Xaraxone

Once again, each value only ranges between 0 and 255. It’s a whole lot more information to process than a simple grayscale image, which means that for computer vision to work properly, the program needs to have been trained on a large number of images. This is where deep learning computer vision comes back into play. The more images that have been processed, the more accurate the software will be.
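A rough Python sketch of the same idea for color: each pixel now carries three numbers instead of one. The pixel values and function names below are invented for illustration, and the grayscale weights are the commonly used BT.601 luma coefficients, which this article does not specify.

```python
# A hypothetical 2x2 color image: every pixel is a (red, green, blue) triple,
# and each channel ranges from 0 to 255.
color_image = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (255, 255, 255)],
]


def count_values(img):
    """Count how many numbers are needed to store the image."""
    return sum(len(pixel) for row in img for pixel in row)


def to_grayscale(img):
    """Collapse each (r, g, b) triple into a single brightness value.

    Uses the common BT.601 luma weights (an assumption, not from the article).
    """
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
        for row in img
    ]


if __name__ == "__main__":
    print(count_values(color_image))   # 4 pixels x 3 channels
    print(to_grayscale(color_image))
```

A four-pixel color image already needs twelve numbers where a grayscale one needs four, which hints at why color images take more data and more training to handle well.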

 

Examples of Computer Vision in Mobile Apps

As we’ve already stated, technology has finally progressed to the point where mass usage of computer vision has become viable. Mobile apps have become powerful and complex enough to facilitate computer vision applications in ways that are already affecting your life. Real time reaction and learning have provided a means for a variety of new user experiences, both for fun and practicality. For example:

1. Snapchat filters

Snapchat is one of the most notable users of computer vision for entertaining its user base. From dog ears to flower crowns to rainbow waterfalls, Snapchat provides a variety of ways to alter your face. This is, of course, possible through the relatively recent advent of a computer vision application that can manipulate images in real time.

The filters work by Snapchat analyzing your face and, in a matter of seconds, recognizing and quantifying your features and structures. The human face has a few landmarks that provide excellent jumping off points for this process, including your nose, mouth, eyes, and eyebrows. Once your face has been mapped out, Snapchat draws on its deep learning to equate your features to an “average face”.

The “average face” is the most important part for the real time filters, as the computer vision creates a mesh that overlaps with your facial structure. From there, the algorithm can react and manipulate its selection of filters to correspond to how your face changes.
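As a loose illustration of how facial landmarks can drive a filter, the Python sketch below works out where to draw an overlay (say, a pair of sunglasses) from two eye positions. The landmark coordinates, the `place_overlay` function, and the width factor are all hypothetical; this is not Snapchat's actual implementation, only the kind of geometry such a step might use.

```python
import math


def place_overlay(left_eye, right_eye, width_factor=2.0):
    """Compute center, width, and tilt for an overlay spanning both eyes.

    left_eye and right_eye are (x, y) pixel coordinates, assumed to come from
    some face-landmark detector. width_factor is an arbitrary illustrative
    choice for how much wider than the eye distance the overlay should be.
    """
    (lx, ly), (rx, ry) = left_eye, right_eye
    center = ((lx + rx) / 2, (ly + ry) / 2)
    eye_distance = math.hypot(rx - lx, ry - ly)
    # Angle of the line between the eyes, in degrees; 0 means a level head.
    tilt = math.degrees(math.atan2(ry - ly, rx - lx))
    return {"center": center, "width": eye_distance * width_factor, "tilt": tilt}


if __name__ == "__main__":
    # A level face, then the same face tilted: the overlay follows the eyes.
    print(place_overlay((100, 120), (160, 120)))
    print(place_overlay((100, 120), (160, 180)))
```

Recomputing these three values on every video frame is what would let an overlay stay attached to a moving face.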

 

2. Amazon Go

Imagine a convenience store where the shopping process has been perfectly streamlined. You walk in, grab what you need, and walk out without ever bothering with a cashier. While that may at one time have been more science fiction than reality, Amazon has delivered exactly that through the power of computer vision and machine learning.

Amazon Go is a combination of app and store: you’ll need one to get into the other. It uses computer vision to keep track of stock, maintenance, and every customer in the store to ensure security and effectiveness. Its cameras and sensors, located around the store, detect and connect everyone in the store to their Amazon account, while simultaneously keeping track of every item that each customer is currently carrying.

In a nutshell, it’s impressive and only attainable through this specific AI technology. As soon as you’ve finished shopping, you can walk straight out the door and Amazon will automatically charge your account for everything you’ve taken with you.
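The bookkeeping behind a checkout-free store can be pictured as a virtual cart that is updated whenever the cameras see an item picked up or put back. The Python class below is a hypothetical sketch of that idea only; the `VirtualCart` class, its method names, and the prices are invented, and it says nothing about how Amazon's real system is built.

```python
from collections import Counter


class VirtualCart:
    """Tracks what one shopper is carrying, based on detected events."""

    def __init__(self):
        self.items = Counter()

    def pick_up(self, item):
        """Record that the shopper was seen taking an item off a shelf."""
        self.items[item] += 1

    def put_back(self, item):
        """Record a return; ignore items the shopper was never seen taking."""
        if self.items[item] > 0:
            self.items[item] -= 1

    def total(self, prices):
        """Amount (in cents) to charge when the shopper walks out."""
        return sum(prices[item] * count for item, count in self.items.items())


if __name__ == "__main__":
    prices = {"sandwich": 599, "soda": 199}  # made-up prices in cents
    cart = VirtualCart()
    cart.pick_up("sandwich")
    cart.pick_up("soda")
    cart.pick_up("soda")
    cart.put_back("soda")
    print(cart.total(prices))
```

The hard part in practice is the computer vision that produces reliable pick-up and put-back events; once those exist, charging the right account is simple arithmetic.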

 

3. Pinterest Lens

Rather than focusing on real time movement like Amazon and Snapchat have gone for, Pinterest focuses on what it does best: connecting you with your interests. All it takes is snapping a photo of something you like in the world, such as a car, plant, or artwork, and Pinterest Lens immediately routes you towards anything inspired by that interest.

As always, AI technology is the critical component in making this work, drawing on a comprehensive deep learning computer vision backlog. Pinterest is nothing but images, an enormous catalogue of information that feeds and informs its algorithm. That algorithm deconstructs, analyzes, and then compares the image you took with thousands of others on Pinterest and across the web.
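One common way to compare images is to reduce each one to a list of numbers (a feature vector) and rank stored images by how close their vectors are to the query. The Python sketch below assumes that approach; the catalogue, the three-number vectors, and the function names are invented for illustration, and this article does not describe how Pinterest's real system works.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(query, catalogue, top_n=2):
    """Return the names of the top_n catalogue entries closest to the query."""
    ranked = sorted(
        catalogue.items(),
        key=lambda item: cosine_similarity(query, item[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:top_n]]


# Hypothetical feature vectors; a real system would typically get these from
# a trained neural network, with far more than three dimensions.
catalogue = {
    "red sports car": [0.9, 0.1, 0.0],
    "potted fern": [0.1, 0.9, 0.2],
    "vintage convertible": [0.8, 0.2, 0.1],
}

if __name__ == "__main__":
    photo = [0.85, 0.15, 0.05]  # vector for the photo the user just took
    print(most_similar(photo, catalogue))
```

With vectors like these, a photo of a car ranks the two car images ahead of the fern, which is the behavior a "find similar" feature needs.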

 

4. Amazon Echo Look

Rather than focusing on music and audio functions like others in Amazon’s Echo line of products, the Amazon Echo Look is dedicated to fashion. This includes voice-activated camerawork, requested styling advice for your outfits, and detailed cinematography for capturing the best picture.

Chances are you can see where computer vision comes into this. Not only does the Echo Look analyze your outfits while affecting your surroundings to create a photogenic likeness, its AI components even help you accentuate your look. It also keeps track of what’s in your wardrobe, categorizes your clothing, and suggests what you can buy from Amazon to complete your look.

The Echo Look algorithm draws on deep learning, leveraging the experiences and feedback gathered from its users to build a stronger network dedicated to fashion design and stylization. It needs to take numerous factors into account to get this right: size, skin tone, color, what’s available, and so on. Computer vision and machine learning are what make it all possible.

 

Advantages of Computer Vision

The advantages of computer vision fall under a fascinating number of headings. Nearly every sector, both private and public, can benefit from using computers to track, analyze, and interpret the world around them. The more powerful organizations come to realize what computer vision and machine learning can bring to the table, the more we’ll see this AI technology affecting our lives.

 

Improved Online Merchandising

Online merchandising has traditionally relied on tagging to find what the customer is looking for. A product, such as a backpack, may come with various keywords attached, like “bag,” “blue,” “polyester,” or “cotton,” to help narrow down the search to the right one.

It’s not the most efficient system, but it’s what we’ve been working with for years. However, computer vision helps loosen up that process, making it easier and more accessible for customers to find exactly what they’re looking for.

Rather than rely on tags to sort between different styles of product, computer vision instead compares the actual physical characteristics in each image. This means customers will be able to search via images to find styles similar to what they’re looking for.

 

[Image]

 

Source: Sentient Aware

Unique Customer Experiences

Services like Snapchat and Animoji aim to provide an experience that can only be considered “unique.” The goal is to provide an appealing, entertaining, intuitive product for consumers to return to. Computer vision, especially in facial mapping, augmentation, and manipulation, was unheard of in the mainstream market up until recently.

 

Real-world Product and Content Discovery

As Pinterest Lens exemplifies, concepts across the entire internet and even the real world can become connected through the power of computer vision. A single photograph of anything you’d like opens up a search that brings your interests directly to your doorstep.

Whether you’re looking to buy a similar product or discover new ideas similar to what you’re looking for, services like Pinterest Lens and Facebook can bring that experience to you.

 

Seamless Store Experiences

Amazon has already demonstrated this concept to full effect. No more waiting in long lines, dealing with cashiers, or worrying about handling your wallet when it comes time to pay. The store experience, amplified with computer vision, creates a seamless, efficient environment to do your shopping in. The keyword here is convenience, both for the customer and the company.

 

Augmented Reality

When Google Glass came out, it was hailed as the next big innovation in how technology impacts our daily lives. Granted, Google Glass wasn’t the greatest success story, but it wasn’t off the mark. Augmented reality is the concept of overlaying our daily lives with information provided by the internet and our phones.

For instance, let’s say you wanted to buy a new bike. Rather than going through the time consuming task of searching for information on that bike, computer vision can use augmented reality to provide reviews, facts, and stats about the product immediately.

Services like Google Translate are already making use of this function, providing a means of translating language in real time on your phone. Other companies, like Apple, are diving into the possibilities as well, researching the potential that augmented reality can bring.

 

Disadvantages of Computer Vision

While the future of computer vision holds plenty of promise, every new innovation has its drawbacks. The disadvantages of computer vision center on a hefty issue in the modern age: privacy.

The driving force that makes computer vision as effective as it is is the same issue that leads consumers to doubt whether it should be pursued. Because these systems gather and learn from thousands and thousands of photos, videos, and other pieces of information, everything you do is stored online somewhere, owned by corporations or freely visible to everyone.

With the ability to recognize people’s faces, as well as track their whereabouts and habits, computer vision has changed the future of privacy. As this AI technology becomes more prevalent, users will need to become more aware of what sort of data they put out into the world. Computer vision searches and analyzes countless images and videos, and chances are that means you’re going to be in some of them.

 

What Should You Take Away From This?

Computer vision has the potential to change the landscape of every field you can imagine: healthcare, gaming, security, and so on. The average citizen’s life can be made easier with simpler shopping, augmented reality for better informed decisions, and greater connectivity to the world altogether, both physical and digital.

However, it also means that you’ll need to be more careful about what you put out onto the internet. As privacy becomes less and less private, sensitive materials that you may not want people to find, such as addresses, accounts, and other personal information, need to be kept more secret than before.

The bottom line is that computer vision has both advantages and disadvantages. While it’s an incredible technology, it’s also worth a healthy amount of skepticism. New advances are never quite right the first time around, after all.

 
