AI, Machine Learning and Deep Learning
How do AI terms come into Data Warehouse considerations? Often the warehouse is used for Business Intelligence (BI), where reports and graphs are made and humans make decisions based on the patterns they see. A human can only hold a few objects in their head at once; a computer does not care if there are millions of objects.
​
AI methods depend on large quantities of data; in general, the more data, the more accurate the result. In a warehouse we usually store large quantities of data. The purpose of BI is often to see patterns that can tell us about the future and about what decisions we should make.
Machine learning algorithms can be a way to find those patterns. Without understanding the algorithms completely, you can run your data through them and test which one finds the patterns you are after most reliably.
For me it is a totally new way of thinking. You first need to know the output, the goal you want to achieve. Then you need to find the input that matches that goal. Then you train and test the input data with an ML algorithm to find the best way of exercising the data. When this is done you have a model that you can run predictions against. You never have to create any complex algorithms yourself.
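To make the "output first, then input" idea concrete, here is a minimal sketch. It assumes scikit-learn (this post names no library), and the rows, the column names and the churn goal are all hypothetical, invented only for illustration.

```python
from sklearn.tree import DecisionTreeClassifier

# Hypothetical rows, standing in for the result of a warehouse query.
rows = [
    {"orders_last_year": 12, "support_tickets": 0, "churned": 0},
    {"orders_last_year": 1, "support_tickets": 4, "churned": 1},
    {"orders_last_year": 8, "support_tickets": 1, "churned": 0},
    {"orders_last_year": 0, "support_tickets": 3, "churned": 1},
    {"orders_last_year": 15, "support_tickets": 2, "churned": 0},
    {"orders_last_year": 2, "support_tickets": 5, "churned": 1},
]

# The output (the goal): the column we want to predict.
target_column = "churned"

# The input: the columns we believe say something about that goal.
feature_columns = ["orders_last_year", "support_tickets"]

X = [[row[column] for column in feature_columns] for row in rows]
y = [row[target_column] for row in rows]

# Training: the library works out the decision rules from the data.
model = DecisionTreeClassifier(random_state=0)
model.fit(X, y)

# The trained model can now be asked about a customer it has not seen.
new_customer = [[10, 1]]  # 10 orders last year, 1 support ticket
prediction = model.predict(new_customer)
print(prediction)
```

Nothing in this sketch spells out how churn should be detected; the only decisions made by hand are which column is the goal and which columns are the input.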
​
Definitions:
- Artificial Intelligence (AI) is any technique that enables computers to mimic human behavior.
- Machine Learning (ML) is an AI technique that gives computers the ability to learn without being explicitly programmed to do so.
- Deep Learning is a subset of ML that makes the computation of multi-layer neural networks feasible.
How is it done in practice?
I am mainly going to talk about machine learning techniques, since they are the most common in a DW. In most cases you have already prepared and cleaned the data to fit BI.
​
The work process for creating a machine learning project is:
- Import Data (You often have good enough data in your DW).
- Clean the data (could have been done in the ELT process).
- Split data into training data and a test data set.
- Create a model.
- Train the model.
- Make predictions.
- Test and evaluate the model and make improvements.
(I am going to add a simple code sample made in Python).
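As a rough sketch of the steps above, assuming scikit-learn and NumPy (this post names no library), the whole process could look like this. The bundled iris dataset stands in for a warehouse table, and the decision tree and its parameters are arbitrary choices made for illustration.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# 1. Import data. The bundled iris dataset stands in for a warehouse table.
X, y = load_iris(return_X_y=True)

# 2. Clean the data: keep only rows without missing values.
#    (iris has none, so this is a placeholder for real cleaning.)
complete_rows = ~np.isnan(X).any(axis=1)
X, y = X[complete_rows], y[complete_rows]

# 3. Split the data into a training set and a test set (80 % / 20 %).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# 4. Create a model.
model = DecisionTreeClassifier(max_depth=3, random_state=42)

# 5. Train the model on the training set only.
model.fit(X_train, y_train)

# 6. Make predictions for rows the model has not seen.
predictions = model.predict(X_test)

# 7. Test and evaluate: compare the predictions with the known answers.
accuracy = accuracy_score(y_test, predictions)
print(f"Accuracy on the test set: {accuracy:.2f}")
```

If the accuracy is not good enough, the improvement loop is to go back and change the input columns, the cleaning or the model, and then run the same steps again.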