
As described before, dimensional expansion is a simple technique for creating a high-dimensional, dense data structure for machine learning applications. It is useful for large data samples, such as the long sequences presented in the previous example. Large image data could also be transformed so that related attributes are brought close together.
To test that hypothesis, a dataset built from NASA AIRS data is used to check whether the technique can find an accurate representation of the data. The dataset consists of a series of scans of temperature, pressure, ozone, cloudiness, and other variables. The data is sampled at 1-degree resolution, resulting in an array of shape (180, 360). This array can then be reshaped into an array of shape (32,32,64).
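As a rough illustration of that reshaping step, here is a minimal NumPy sketch. It is not the code from the post: the function name `expand_grid`, the zero padding, and the random stand-in scan are all assumptions made for the example. The padding is needed because 180 x 360 gives 64,800 values while 32 x 32 x 64 gives 65,536, so the sketch fills the remaining 736 positions with a constant. The original code may handle that gap differently.

```python
import numpy as np


def expand_grid(grid, target_shape=(32, 32, 64), fill_value=0.0):
    """Flatten a 2D grid and reshape it into a 3D block (hypothetical helper).

    Assumption: positions left over after the flattened grid are filled
    with `fill_value`, since 180 * 360 = 64800 < 32 * 32 * 64 = 65536.
    """
    flat = np.asarray(grid, dtype=np.float32).ravel()
    target_size = int(np.prod(target_shape))
    if flat.size > target_size:
        raise ValueError("grid has more values than the target shape can hold")
    padded = np.full(target_size, fill_value, dtype=np.float32)
    padded[: flat.size] = flat
    return padded.reshape(target_shape)


# Synthetic stand-in for a single 1-degree scan (not real AIRS data).
scan = np.random.default_rng(0).normal(size=(180, 360))
cube = expand_grid(scan)
print(cube.shape)
```

Each scan then becomes one (32, 32, 64) sample that a convolutional model can consume directly.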
Applying a simple convolutional variational autoencoder results in the ability to encode and decode the data. To better understand the nature of the learned representation and how it might impact the output data, a latent walk is applied to the model, resulting in a series of images that reconstruct the data.
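A latent walk can be sketched in a few lines: hold every latent coordinate fixed, sweep one of them over a range of values, and decode each point. The snippet below is only an illustration under stated assumptions. `latent_walk` and `toy_decode` are hypothetical names, the latent size of 2 is a guess, and the decoder is a fixed random linear map standing in for the trained decoder of the autoencoder.

```python
import numpy as np


def latent_walk(decode, latent_dim, axis, values, base=None):
    """Decode points that differ only along one latent axis (hypothetical helper)."""
    if base is None:
        base = np.zeros(latent_dim, dtype=np.float32)
    # One row per step; every row starts as a copy of the base point.
    points = np.tile(np.asarray(base, dtype=np.float32), (len(values), 1))
    points[:, axis] = values
    return decode(points)


# Stand-in decoder (assumption): a fixed random linear map, NOT a trained model.
rng = np.random.default_rng(0)
weights = rng.normal(size=(2, 32 * 32 * 64)).astype(np.float32)


def toy_decode(z):
    return (z @ weights).reshape(-1, 32, 32, 64)


steps = np.linspace(-3.0, 3.0, 7)
frames = latent_walk(toy_decode, latent_dim=2, axis=0, values=steps)
print(frames.shape)  # (7, 32, 32, 64)
```

With a trained model, `decode` would be whatever callable maps a batch of latent vectors to reconstructions, and each frame can be plotted side by side to see what the chosen axis controls.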
However, for the pressure data specifically, no pattern or cluster can be found in the learned dimensions.
Autoencoding other data sources resulted in weak clustering of the different samples within the learned representation.
Also, the latent walk shows recognizable patterns that change in the same direction as the clustering axis.
The previous two examples show how a single data source can be used to train a simple autoencoder and obtain a small representation of the data. The learned representation also shows that the model learns specific time-related changes that could be used in further applications.
The specific time scale obtained from this analysis could be used as a general time scale to improve weather and environmental modeling. Yet the specific identity of such a scale is not presented or investigated at the moment.
Now you have an example of how to use a simple technique to analyze climate data, and how to extend it with minimal changes to the code. As always, the complete code for this post can be found on my GitHub, and a live example can be found on Kaggle. See you in the next one.