Feature Extraction


JUNE 27, 2019

Acquiring the best data with the right data acquisition hardware and software is a key step in monitoring critical assets, processes, efficiency, quality, and performance metrics. Through further analysis, this information can be used to improve business operations. After acquiring the data, a number of preprocessing steps should be applied to prepare it for analysis. These steps make it easier to identify anomalies, build more robust models, and obtain more accurate predictions. Data preprocessing includes removing noise and unnecessary outliers, synchronizing data with its designated storage source, and handling missing values, either by removing the affected data or by substituting values, so that the gaps do not bias the model.

Once preprocessing is complete, the next step is to decide what inputs to feed into your predictive analysis model. This is where feature extraction comes into play.

Feature extraction is the process of transforming raw data into a more manageable set of informative and non-redundant values. Raw data sets can be too large to input into algorithms. Feature extraction is used to convert the complex textual and image data into numeric values that make algorithm calculations easier. Feature extraction also draws more meaningful attributes from raw data. These values and attributes have a large influence on the accuracy of predictions in machine learning and analytics.

There are many different methods of feature extraction that can be applied to your data. The specific application will help you best determine which approach to utilize.

To further highlight how to apply feature extraction to industrial applications, CTO Dr. David Siegel and Senior Data Scientist Dr. Jia Cai have provided some feature extraction scenarios:

Waveform Data/Vibration Analysis

When handling waveform data and vibration data, you must first determine if you are utilizing an existing predictive monitoring application with well-established feature engineering techniques or a newer, less-established industrial application.

For a new application, more general feature engineering methods should be employed for waveform data. These include converting the data into the frequency domain and calculating the energy in various frequency bands, as well as finding the magnitude of the top five to ten peaks in the spectrum. Signal processing methods, such as filtering, and descriptive statistics of the time-domain signal can supply additional general-purpose features.
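As a rough illustration of the general approach described above, the sketch below computes band energies and the largest spectral peaks with NumPy. The function name `spectral_features`, the band edges, and the synthetic two-tone signal are all assumptions made for this example; they are not taken from any particular monitoring application.

```python
import numpy as np

def spectral_features(signal, fs, bands, n_peaks=5):
    """Return band energies plus the frequencies and magnitudes of the top peaks."""
    signal = np.asarray(signal, dtype=float)
    n = signal.size
    # Magnitude spectrum of the mean-removed signal, scaled by the sample count.
    spectrum = np.abs(np.fft.rfft(signal - signal.mean())) / n
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)

    # Energy per band: sum of squared magnitudes of the bins in [lo, hi).
    band_energy = [
        float(np.sum(spectrum[(freqs >= lo) & (freqs < hi)] ** 2))
        for lo, hi in bands
    ]

    # Local maxima: bins that are larger than both neighbours.
    interior = spectrum[1:-1]
    is_peak = (interior > spectrum[:-2]) & (interior > spectrum[2:])
    peak_idx = np.flatnonzero(is_peak) + 1
    # Keep the n_peaks largest peaks, ordered by descending magnitude.
    top = peak_idx[np.argsort(spectrum[peak_idx])[::-1][:n_peaks]]
    return band_energy, freqs[top], spectrum[top]

# Synthetic signal (assumed for illustration): 50 Hz and 120 Hz tones, 1 s at 1 kHz.
fs = 1000
t = np.arange(fs) / fs
x = 1.0 * np.sin(2 * np.pi * 50 * t) + 0.5 * np.sin(2 * np.pi * 120 * t)

band_energy, peak_freqs, peak_mags = spectral_features(
    x, fs, bands=[(0, 100), (100, 200), (200, 500)], n_peaks=2
)
```

Here the two strongest peaks fall at the two tone frequencies, and most of the energy lands in the lowest band. On real data, the number of bands, their edges, and the number of peaks to keep would be tuned to the application.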

If your application has well-established feature engineering methods, the typical recommendation is to build upon them. For example, if you were assessing failure in rolling element bearings, you could perform envelope analysis and extract the magnitude around the bearing fault frequencies. These features are effective at revealing the early symptoms of bearing degradation and indicating which bearing component is at fault. Similarly, if you were monitoring the condition of gear components, you could reduce the noise in complex signals through synchronous averaging and then analyze the amplitude and frequency modulation of the averaged signal. Comparable domain-specific methods are available for shafts, motors, belts, and other components with known frequency or kinematic relationships. Even with domain-specific feature engineering methods, you can still combine general features with specific features to obtain a more complete description of the condition of the machine or process.
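The bearing example can be sketched in a few lines. This is a minimal illustration of envelope analysis, assuming a synthetic signal in which a 3 kHz resonance is amplitude-modulated at a made-up fault frequency of 87 Hz. The helper names `envelope_spectrum` and `magnitude_near` are hypothetical, and the band-pass filtering that usually precedes the envelope step on real measurements is omitted for brevity.

```python
import numpy as np

def envelope_spectrum(signal, fs):
    """Amplitude spectrum of the signal envelope, computed via the analytic signal."""
    x = np.asarray(signal, dtype=float)
    n = x.size
    # FFT-based Hilbert transform: keep DC (and Nyquist), double the positive
    # frequencies, and zero the negative ones to obtain the analytic signal.
    h = np.zeros(n)
    h[0] = 1.0
    if n % 2 == 0:
        h[n // 2] = 1.0
        h[1:n // 2] = 2.0
    else:
        h[1:(n + 1) // 2] = 2.0
    envelope = np.abs(np.fft.ifft(np.fft.fft(x) * h))
    envelope = envelope - envelope.mean()
    spectrum = 2.0 * np.abs(np.fft.rfft(envelope)) / n
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    return freqs, spectrum

def magnitude_near(freqs, spectrum, target_hz, tol_hz=2.0):
    """Largest envelope-spectrum magnitude within tol_hz of target_hz."""
    mask = np.abs(freqs - target_hz) <= tol_hz
    return float(spectrum[mask].max())

# Synthetic data (assumed): 3 kHz resonance modulated at an 87 Hz "fault" frequency.
fs = 20000
t = np.arange(fs) / fs
fault_hz = 87.0
x = (1.0 + 0.5 * np.cos(2 * np.pi * fault_hz * t)) * np.sin(2 * np.pi * 3000 * t)

freqs, spectrum = envelope_spectrum(x, fs)
fault_feature = magnitude_near(freqs, spectrum, fault_hz)   # close to the 0.5 modulation depth
off_fault_feature = magnitude_near(freqs, spectrum, 150.0)  # close to zero
```

In a real application the target frequencies would come from the bearing geometry and shaft speed, and one feature would typically be extracted per fault frequency and its first few harmonics.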

Textual Data to Real Value Vector

Before feeding data into a machine learning algorithm, textual data needs to be converted to real numbers, either as integers or floating-point values. Below are two commonly used methods:

Bag-of-Words Model

The bag-of-words (BOW) model is one of the most common methods for document classification. It uses the frequency of each word as a feature. For this approach, a vocabulary of known words must first be constructed. Then, each sentence or document is converted into a fixed-length numeric vector in which each element holds the number of times the corresponding vocabulary word occurs.

For example, this method would be useful when you are collecting the status code and text data from a machine. If the frequency or count of certain words changes from the nominal behavior, BOW features and an anomaly detection method could flag this unusual behavior. Addressing unusual machine behavior reduces unplanned downtime and leads to faster troubleshooting.
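A minimal bag-of-words sketch using only the Python standard library is shown here. The status messages are invented for illustration, and tokenization is simply lowercasing and splitting on whitespace, which is an assumption; real machine logs may need more careful parsing.

```python
from collections import Counter

def build_vocabulary(documents):
    """Sorted list of the unique lowercase tokens across all documents."""
    return sorted({token for doc in documents for token in doc.lower().split()})

def bag_of_words(document, vocabulary):
    """Fixed-length count vector aligned with the vocabulary order."""
    counts = Counter(document.lower().split())
    return [counts[word] for word in vocabulary]

# Hypothetical machine status messages.
logs = [
    "spindle ok coolant ok",
    "spindle ok coolant low",
    "spindle fault coolant low coolant low",
]

vocabulary = build_vocabulary(logs)
vectors = [bag_of_words(doc, vocabulary) for doc in logs]

print(vocabulary)  # ['coolant', 'fault', 'low', 'ok', 'spindle']
for vec in vectors:
    print(vec)
# [1, 0, 0, 2, 1]
# [1, 0, 1, 1, 1]
# [2, 1, 2, 0, 1]
```

Each message becomes a vector of the same length, so the vectors can be passed directly to an anomaly detection method that compares them against nominal behavior.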

Word Embedding (Word2Vec)

Word2Vec takes a large body of text as input and generates a vector space in which each unique word is assigned a corresponding vector. Unlike the BOW approach, which ignores the meaning of each word, Word2Vec positions the word vectors so that words used in similar contexts in the text body, which tend to have similar meanings, lie close to one another in the space. As a result, Word2Vec captures relationships between words that count-based representations such as BOW do not.

For example, this technique would be a good approach for categorizing and grouping maintenance text records. The entries in these records are often inconsistent, with the same issue described in different words. Grouping similar words with an intelligent algorithm makes the records more reliable and leads to more accurate maintenance reports. This process also aids model training in predictive monitoring, which depends on understanding the failure mode and health condition of the machine or asset.

Image to Feature Vector

An image feature vector is a numeric vector that represents the content of an image. There are several different ways to generate this feature vector. The features that are chosen depend on the goal, such as predictive quality or predictive maintenance, and on the type of information, anomaly, or defect you are trying to detect. Some common image features are the color mean, the color standard deviation, and 3-D color histograms. One of the most basic image feature vector representations is the raw RGB pixel intensity of an image, where each pixel is represented by three numeric values. This is the easiest way to convert an image because it requires no additional processing steps.
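The image features mentioned above can be sketched with NumPy as follows. The function name `image_features`, the choice of four histogram bins per channel, and the random test image are assumptions for illustration; a real image would be loaded as an array of shape (height, width, 3) with 8-bit RGB values.

```python
import numpy as np

def image_features(image, bins=4):
    """Concatenate per-channel mean, per-channel std, and a 3-D color histogram."""
    # Flatten the image to one row per pixel with columns (R, G, B).
    pixels = np.asarray(image, dtype=float).reshape(-1, 3)
    color_mean = pixels.mean(axis=0)
    color_std = pixels.std(axis=0)
    # 3-D histogram over the joint (R, G, B) values, normalized to sum to 1.
    hist, _ = np.histogramdd(pixels, bins=(bins, bins, bins), range=[(0, 256)] * 3)
    hist = hist.ravel() / pixels.shape[0]
    return np.concatenate([color_mean, color_std, hist])

# Random stand-in for a 32 x 32 RGB image (assumed data, not a real inspection image).
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)

features = image_features(image)                  # 3 + 3 + 4**3 = 70 values
raw_pixels = image.reshape(-1).astype(float)      # raw RGB intensities, 32 * 32 * 3 values
```

The summary features give a short, fixed-length vector regardless of image size, while the raw pixel vector grows with the resolution of the image.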

Appropriate feature extraction is the key to building useful predictive models and deploying an impactful solution to improve your business through analytics.
