Tuesday, March 19, 2024

AI: The Curious Case Of Concept Drift

By Anand Tamboli


Artificial Intelligence or Machine Learning applications do not sit in a black box. They evolve continuously and so must be supervised continuously. When a change occurs, we can fix it in a near real-time environment with the proposed algorithm and technique.

Think about a scenario where you are using an electric motor on your shop-floor. You have installed a few sensors on this motor’s outer body, and these sensors are continuously sending the data over the wireless network. Moreover, you also have an elaborate setup on the cloud where your application analyses these parameters and determines your motor’s health status.

Of course, this setup relies on the patterns of motor vibration and current consumption that the model learned during its training phase. So long as nothing changes, this pattern recognition works like a charm. That is what a good machine learning outcome would look like.


However, the data generated by an electric motor can change over time. Because the analysis assumes a static relationship between key parameters and motor health, such changes can lead to poor analytical results.

The change in data occurs due to various real-life scenarios. These range from changes in operating load conditions to aging of mechanical components such as ball bearings, or wear and tear of the foundation on which the motor is installed. Environmental conditions can change, and several other factors may be affected too.

This occurrence is quite common in several other real-life machine learning scenarios too. The data changes over time and invalidates relationships that were assumed to be static and programmed as such. In technical jargon, it is known as ‘concept drift.’ The word concept refers to an unknown and hidden relationship between the output and its input variables. The factors that cause this relationship to change are usually not among the inputs and are known as ‘hidden context.’

What is the problem here?

For a static use case, concept drift does not pose any problem at all. However, in most use cases, the relationship between the input parameters (features) and the output characteristics changes with time. If you have assumed this relationship to be static during machine learning model development, it could become an issue in the future.

This problem may be relatively easy to handle if you are maintaining these relationships and formulae on the cloud. You can update the formulae and relationships there, and everything will be back to normal.

However, if your application architecture depends on edge computing and pushes learned models to the edge sensors for faster responses, this (new) learning must be transferred regularly. Interestingly, in industrial implementations, the use of edge computing is highly recommended.

The challenge is how to update these formulae in low-cost sensors, especially when these sensors do not have large memory or when over-the-air (OTA) firmware updates are not feasible. How can you send and update only the basic formula on one sensor device, if needed?

If the solution is designed right, it will have accounted for such drift in the first place and will adapt to the changing scenarios. If that is not the case, either your solution will fail or its performance will degrade. That may mean lower accuracy, lower speed, higher error margins, and other such issues.

However, remember that it is difficult to identify in advance every scenario in which concept drift may occur, especially when you are training the model for the first time. So, instead of working on pre-deployment detection, your solution should safely assume that the drift will occur and accordingly have a provision to handle it.

If your solution cannot handle drift, you will have to fix it on an ongoing basis. It is a costly proposition as you keep spending money on constant fixes. There is also the risk of tolerating poor performance for some time. This risk increases if the drift is not readily apparent.

And if you fail to detect the drift for a more extended period, it can be quite risky on several fronts. Thus, the detection of drift is a critical factor. Check if your solution has accounted for concept drift and, if yes, then to what extent. The answer to that check would tell you the level of risk.

Some challenges in handling concept drift

The first and obvious challenge is to detect when it occurs. I recommend two possible methods to handle it:

1. When you finalise a deployment model, record its baseline performance parameters. After deployment, periodically monitor these parameters. If you see any difference, and if it is significant, it could indicate potential concept drift. In that situation, take action to fix it.

2. The other way to handle it is to assume drift will occur. It means you will put a plan in place to periodically update the cloud model and the sensors or edge network. The challenge, in that case, is to handle the edge sensor updates without significant downtime.
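The first method can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name `DriftMonitor`, the choice of mean absolute error as the performance parameter, and the `tolerance` and `window` values are all hypothetical and not part of the original article. A real deployment would pick the metric and threshold that suit its own model.

```python
from collections import deque
from statistics import mean


class DriftMonitor:
    """Compare a rolling mean error against a baseline recorded at deployment.

    All names and thresholds here are illustrative assumptions.
    """

    def __init__(self, baseline_error, tolerance=0.5, window=50):
        self.baseline_error = baseline_error  # mean absolute error at deployment
        self.tolerance = tolerance            # allowed relative increase (0.5 = +50%)
        self.errors = deque(maxlen=window)    # most recent absolute errors

    def update(self, predicted, actual):
        """Record one prediction; return True if drift is suspected."""
        self.errors.append(abs(predicted - actual))
        if len(self.errors) < self.errors.maxlen:
            return False                      # window not full yet
        limit = self.baseline_error * (1 + self.tolerance)
        return mean(self.errors) > limit


# Usage: errors near the baseline raise no flag; a sustained jump does.
monitor = DriftMonitor(baseline_error=1.0, tolerance=0.5, window=5)
stable = [monitor.update(10.0, 10.8) for _ in range(5)]
drifted = [monitor.update(10.0, 13.0) for _ in range(5)]
```

A rolling window is used so that a single noisy reading does not trigger an alarm; only a sustained rise in error above the recorded baseline is reported.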

The first is easier to manage. The second, however, poses a technical problem: the sensor needs an in-built capability to accept model (formulae) updates without a complete firmware update. This is where you must use the dynamic evaluation algorithm.

The dynamic evaluation algorithm

The underlying notation was first proposed in 1954. Desktop calculators adopted it in the 1960s, notably the Friden EC-130 in 1963 and HP’s 9100A in 1968, and it remains in use in many calculators and expression evaluators today.

This algorithm relies on a specific type of representation of the formula, known as Reverse Polish Notation (RPN) or Postfix Notation.
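To make the notation concrete, here is a minimal sketch that converts a conventional (infix) formula into postfix form. It uses Dijkstra's shunting-yard algorithm, which is one common way to do the conversion and is not taken from the original article. The function name `to_postfix`, the space-separated token format, and the variable names `vibration` and `current` are assumptions made for illustration.

```python
# Operator precedence; all four operators are treated as left-associative.
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}


def to_postfix(infix):
    """Convert a space-separated infix expression to postfix (RPN)."""
    output, stack = [], []
    for token in infix.split():
        if token in PRECEDENCE:
            # Pop operators of higher or equal precedence before pushing.
            while (stack and stack[-1] in PRECEDENCE
                   and PRECEDENCE[stack[-1]] >= PRECEDENCE[token]):
                output.append(stack.pop())
            stack.append(token)
        elif token == "(":
            stack.append(token)
        elif token == ")":
            while stack and stack[-1] != "(":
                output.append(stack.pop())
            if not stack:
                raise ValueError("unbalanced parentheses")
            stack.pop()  # discard the matching "("
        else:
            output.append(token)  # number or variable name
    while stack:
        op = stack.pop()
        if op == "(":
            raise ValueError("unbalanced parentheses")
        output.append(op)
    return " ".join(output)


print(to_postfix("3 + 4 * 2"))                    # 3 4 2 * +
print(to_postfix("( vibration + 2 ) * current"))  # vibration 2 + current *
```

The conversion would typically run on the cloud side, so that the sensor only ever receives the compact postfix string.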

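A postfix expression is unambiguous: it needs no parentheses or operator-precedence rules. It means once we convert an equation into postfix notation, it becomes easier for a computer to evaluate it. The expression is read from left to right, operands are pushed onto a stack, and each operator is applied to the operands most recently pushed.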
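A stack-based evaluator for such expressions is short enough to sketch here. The Python below is an illustration, not the implementation from the linked project: the function name `eval_postfix`, the space-separated token format, and the two health-score formulae are hypothetical. On a real sensor, the same logic would more likely be written in C.

```python
import operator

OPERATORS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


def eval_postfix(expression, variables=None):
    """Evaluate a space-separated postfix expression using a stack."""
    variables = variables or {}
    stack = []
    for token in expression.split():
        if token in OPERATORS:
            if len(stack) < 2:
                raise ValueError("malformed expression")
            right = stack.pop()  # most recently pushed operand is the right side
            left = stack.pop()
            stack.append(OPERATORS[token](left, right))
        elif token in variables:
            stack.append(float(variables[token]))
        else:
            stack.append(float(token))  # raises ValueError for unknown names
    if len(stack) != 1:
        raise ValueError("malformed expression")
    return stack[0]


# Hypothetical formulae: updating the model means replacing one string.
readings = {"vibration": 2.0, "current": 4.0}
formula_v1 = "vibration 3 * current +"      # 3 * vibration + current
formula_v2 = "vibration 3 * current 2 * +"  # retuned after drift is detected

print(eval_postfix(formula_v1, readings))   # 10.0
print(eval_postfix(formula_v2, readings))   # 14.0
```

Because the formula is plain data rather than compiled code, a new version can be sent to the device as a short string, which is the property that makes this approach attractive for sensors that cannot take a full firmware update.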

If you would like to learn more about this algorithm and how it can be implemented at a sensor level, please check https://www.electronicsforu.com/electronics-projects/hardware-diy/calculator-using-postfix-notation.

Other scenarios of concept drift

I explained concept drift with an example of an electrical motor. Nonetheless, the problem is not limited to this use case. Many other applications are vulnerable to this phenomenon and resulting problems.

For example, in the credit card spend tracking and fraud detection algorithms used by banks, consumer spending patterns change all the time.

For a security surveillance application used in public places, the visitor pattern can show seasonal or permanent change over time (see what happened with Covid-19). Advertising, retail, marketing, and health applications are equally prone.

If your sensor network is monitoring server room temperature and humidity, a new cabinet or rack addition can also affect the pattern of change of these factors.

Summary

It is unrealistic to expect that data distributions stay stable over a long period. The perfect-world assumptions in machine learning do not hold in most cases because data changes over time, and this is a growing problem as the use of these tools and methods increases.

The key to fixing it is acknowledging that AI or ML applications do not sit in a black box. They evolve continuously and must be supervised at all times. When a change occurs, we can fix it in a near real-time environment with the proposed algorithm and technique.


Anand Tamboli is a serial entrepreneur, speaker, award-winning published author, and an emerging technology thought leader.
