Training
What exactly do we mean by training? Training refers to using a learning algorithm to determine and develop the parameters of your model. In layman's terms, suppose you are training a model to predict whether a heart will fail or not, that is, to output True or False values. You show the algorithm some real-life data labeled True, then show it some data labeled False, and you repeat this process with data labeled according to whether the heart actually failed or not. The algorithm modifies its internal values until it has learned to tell apart data that indicates heart failure (True) from data that does not (False). With Machine Learning, we typically take a dataset and split it into three sets:
- Training set: the subset of the data used to train the algorithm
- Validation set: the subset used to validate our results and fine-tune the algorithm's parameters
- Test set: data the model has never seen before, used to evaluate how good our model is. We can then describe how good the model is using metrics like accuracy, precision, and recall.
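The three-way split above can be sketched in a few lines of plain Python. This is a minimal illustration, not a library implementation; the 70/15/15 proportions are an assumption chosen for the example.

```python
import random

def split_dataset(data, train_frac=0.7, val_frac=0.15, seed=42):
    """Shuffle the dataset, then cut it into training, validation, and test subsets."""
    rng = random.Random(seed)          # fixed seed so the split is reproducible
    shuffled = data[:]
    rng.shuffle(shuffled)
    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # the remainder: data the model never sees in training
    return train, val, test

records = list(range(100))             # stand-in for 100 labeled examples
train, val, test = split_dataset(records)
print(len(train), len(val), len(test))  # 70 15 15
```

Shuffling before splitting matters: if the dataset is ordered (say, all True cases first), an unshuffled split would give the model a skewed view of the data.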
Types of Learning
Supervised Learning
For instance, we can provide a machine learning program with a large volume of pictures of birds and train the model to return the label "bird" whenever it is shown a picture of a bird. When the trained model is then shown a new picture, whether of a cat or a bird, it will label the picture with some level of confidence. This type of Machine Learning is called Supervised Learning, where an algorithm is trained on human-labeled data. The more samples you provide a supervised learning algorithm, the more precise it becomes at classifying new data.
Supervised Learning refers to when we have class labels in the dataset and we use these to build the classification model. We can split it into three categories:
- Regression: Regression models are built by looking at the relationships between features x and the result y where y is a continuous variable. Essentially, Regression estimates continuous values.
- Classification: Classification focuses on discrete values: it assigns discrete class labels y based on many input features x. Some forms of classification include decision trees, support vector machines, logistic regression, and random forests. With classification, we can extract features from the data, where features are distinctive properties of input patterns that help determine the output categories or classes. A classifier uses training data to understand how the given input variables relate to a class.
- Neural Networks: Neural Networks refer to structures that imitate the structure of the human brain.
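To make supervised classification concrete, here is a sketch of one of the simplest possible classifiers, a 1-nearest-neighbor rule: a new point gets the label of the closest labeled training point. The data values are invented for illustration, loosely echoing the heart-failure True/False example.

```python
import math

def predict_1nn(train_points, train_labels, query):
    """Label a query point with the label of its nearest labeled training point."""
    best_i = min(range(len(train_points)),
                 key=lambda i: math.dist(train_points[i], query))
    return train_labels[best_i]

# Hypothetical labeled data: (feature1, feature2) -> True/False class labels
X = [(1.0, 1.0), (1.2, 0.8), (5.0, 5.0), (5.5, 4.8)]
y = [True, True, False, False]

print(predict_1nn(X, y, (1.1, 0.9)))  # True
print(predict_1nn(X, y, (5.2, 5.1)))  # False
```

The more labeled samples you add to `X` and `y`, the finer the boundary between the two classes becomes, which mirrors the point above that supervised learners improve with more labeled data.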
Unsupervised Learning
Unsupervised Learning, another type of machine learning, relies on giving the algorithm unlabeled data and letting it find patterns by itself. You provide the input but not the labels: the algorithm ingests the unlabeled data, draws inferences, and finds patterns. This type of learning can be useful for clustering, where data is grouped according to how similar it is to its neighbors and dissimilar to everything else. Once the data is clustered, different techniques can be used to explore it and look for patterns.
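The clustering idea can be sketched with a bare-bones k-means loop: points are repeatedly assigned to their nearest centroid, and each centroid is moved to the mean of its members. No labels are provided anywhere; the grouping emerges from the data alone. The points and the deterministic initialisation are assumptions made for the example.

```python
import math

def kmeans(points, k, iters=10):
    """Group points into k clusters by alternating assignment and centroid update."""
    centroids = list(points[:k])      # simple deterministic initialisation
    clusters = [[] for _ in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:              # assign each point to its nearest centroid
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        for i, members in enumerate(clusters):
            if members:               # move each centroid to the mean of its members
                centroids[i] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, clusters

# Two obvious groups, one near (0, 0) and one near (9, 9) -- but unlabeled
points = [(0.1, 0.2), (0.2, 0.1), (0.0, 0.0), (9.0, 9.1), (9.2, 8.9), (9.1, 9.0)]
centroids, clusters = kmeans(points, k=2)
print(sorted(len(c) for c in clusters))  # [3, 3]
```

The algorithm recovers the two groups without ever being told they exist, which is the essence of unsupervised learning.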
Reinforcement Learning
Reinforcement Learning, the third type of machine learning algorithm, relies on providing a machine learning algorithm with a set of rules and constraints, and letting it learn how to achieve its goals.
You define the state, the desired goal, allowed actions, and constraints. The algorithm figures out how to achieve the goal by trying different combinations of allowed actions, and is rewarded or punished depending on whether the decision was a good one. The algorithm tries its best to maximize its rewards within the constraints provided. You could use Reinforcement Learning to teach a machine to play chess or navigate an obstacle course.
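The loop described above, try actions, receive rewards, and gradually prefer the actions that maximize reward, can be sketched with tabular Q-learning on a toy corridor. The environment, the reward of 1 at the goal, and all the hyperparameters are assumptions invented for this illustration.

```python
import random

# A tiny corridor: states 0..4, the goal is state 4.
# Allowed actions: -1 (step left), +1 (step right). Reward 1 only at the goal.
N_STATES, GOAL = 5, 4
ACTIONS = [-1, +1]

def step(state, action):
    nxt = max(0, min(N_STATES - 1, state + action))   # walls constrain movement
    return nxt, (1.0 if nxt == GOAL else 0.0)

Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
rng = random.Random(0)
alpha, gamma, epsilon = 0.5, 0.9, 0.2   # learning rate, discount, exploration rate

for episode in range(200):
    state = 0
    while state != GOAL:
        # epsilon-greedy: mostly exploit the best known action, sometimes explore
        if rng.random() < epsilon:
            action = rng.choice(ACTIONS)
        else:
            action = max(ACTIONS, key=lambda a: Q[(state, a)])
        nxt, reward = step(state, action)
        best_next = max(Q[(nxt, a)] for a in ACTIONS)
        # reward good moves, punish wasted ones, via the learned value estimate
        Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])
        state = nxt

# After training, the greedy policy should be "move right" in every state.
policy = [max(ACTIONS, key=lambda a: Q[(s, a)]) for s in range(GOAL)]
print(policy)  # [1, 1, 1, 1]
```

Nothing ever tells the agent that "right" is correct; it discovers this purely from the rewards it collects while acting within the allowed actions.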
Deep Learning
Deep Learning is a specialized subset of Machine Learning. Deep Learning layers algorithms to create a Neural Network, an artificial replication of the structure and functionality of the brain, enabling AI systems to continuously learn on the job and improve the quality and accuracy of results. This is what enables these systems to learn from unstructured data such as photos, videos, and audio files.
Deep learning algorithms do not directly map input to output. Instead, they rely on several layers of processing units. Each layer passes its output to the next layer, which processes it and passes it on. These many layers are why it's called deep learning. When creating deep learning algorithms, developers and engineers configure the number of layers and the type of functions that connect the outputs of each layer to the inputs of the next.
Then they train the model by providing it with lots of annotated examples. For instance, you give a deep learning algorithm thousands of images and labels that correspond to the content of each image. The algorithm will run those examples through its layered neural network, and adjust the weights of the variables in each layer to be able to detect the common patterns that define the images with similar labels.
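The layer-to-layer flow described above can be sketched as a forward pass through a tiny 2-3-1 network. The weights here are arbitrary numbers chosen for illustration; in a real system they would be learned during training.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def layer(inputs, weights, biases):
    """One layer: each output is an activation of a weighted sum of inputs plus a bias."""
    return [sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

# Hypothetical 2-input, 3-hidden, 1-output network with made-up weights
hidden_w = [[0.5, -0.6], [0.1, 0.8], [-0.3, 0.2]]
hidden_b = [0.0, 0.1, -0.1]
out_w = [[1.0, -1.0, 0.5]]
out_b = [0.2]

x = [0.7, 0.3]
hidden = layer(x, hidden_w, hidden_b)   # first layer output...
output = layer(hidden, out_w, out_b)    # ...becomes the next layer's input
print(round(output[0], 3))
```

Each `layer` call is the "passes its output to the next layer" step from the text; stacking more such calls is what makes the network deep.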
Deep Learning has proven to be very efficient at various tasks, including image captioning, voice recognition and transcription, facial recognition, medical imaging, and language translation. Deep Learning is also one of the main components of driver-less cars.
Neural networks
Artificial neural networks borrow some ideas from the biological neural network of the brain, in order to approximate some of its processing results. These units, or neurons, take in data like biological neurons do and learn to make decisions over time. Neural networks learn through a process called backpropagation. Backpropagation uses a set of training data that matches known inputs to desired outputs.
First, the inputs are plugged into the network and outputs are determined. Then, an error function determines how far the given output is from the desired output. Finally, adjustments are made in order to reduce errors.
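The three-step loop above, forward pass, error measurement, adjustment, can be sketched on a single sigmoid neuron trained by gradient descent. The AND-style dataset and learning rate are assumptions for the example; a full network would apply the same chain-rule update across every layer.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Training data matching known inputs to desired outputs (a tiny AND-like task)
data = [((0.0, 0.0), 0.0), ((0.0, 1.0), 0.0), ((1.0, 0.0), 0.0), ((1.0, 1.0), 1.0)]

w = [0.0, 0.0]
b = 0.0
lr = 1.0

def total_error():
    return sum((sigmoid(w[0]*x1 + w[1]*x2 + b) - t) ** 2 for (x1, x2), t in data)

before = total_error()
for _ in range(2000):
    for (x1, x2), target in data:
        out = sigmoid(w[0]*x1 + w[1]*x2 + b)   # 1. forward pass: compute the output
        err = out - target                      # 2. how far from the desired output?
        grad = err * out * (1.0 - out)          # 3. chain rule through the sigmoid
        w[0] -= lr * grad * x1                  # 4. adjust weights and bias
        w[1] -= lr * grad * x2                  #    in the direction that
        b -= lr * grad                          #    reduces the error
after = total_error()
print(after < before)  # True: training reduced the error
```

After training, the neuron's output for (1, 1) sits above 0.5 and for (0, 0) below it, so the repeated small adjustments have pulled the outputs toward the desired ones.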
A collection of neurons is called a layer, and a layer takes in an input and provides an output. Any neural network will have one input layer and one output layer. It will also have one or more hidden layers which simulate the types of activity that goes on in the human brain. Hidden layers take in a set of weighted inputs and produce an output through an activation function. A neural network having more than one hidden layer is referred to as a deep neural network.
Input layers forward the input values to the next layer by multiplying each by a weight and summing the results. Hidden layers receive input from other nodes and forward their output to other nodes. Hidden and output nodes have a property called bias, which is a special type of weight that applies to a node after the other inputs are considered. Finally, an activation function determines how a node responds to its inputs. The function is run against the sum of the inputs and the bias, and the result is forwarded as the output. Activation functions can take different forms, and choosing them is a critical component of the success of a neural network.
Convolutional neural networks or CNNs are multilayer neural networks that take inspiration from the animal visual cortex. CNNs are useful in applications such as image processing, video recognition, and natural language processing. A convolution is a mathematical operation, where a function is applied to another function and the result is a mixture of the two functions. Convolutions are good at detecting simple structures in an image, and putting those simple features together to construct more complex features. In a convolutional network, this process occurs over a series of layers, each of which conducts a convolution on the output of the previous layer.
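A one-dimensional convolution shows the core operation in a few lines: a small kernel slides along the input, and each output value is a weighted sum of the window under it. The edge-detecting kernel here is a classic illustration, not taken from the text.

```python
def convolve1d(signal, kernel):
    """Slide the kernel across the signal; each output is a weighted sum (valid mode)."""
    k = len(kernel)
    return [sum(kernel[j] * signal[i + j] for j in range(k))
            for i in range(len(signal) - k + 1)]

# A difference kernel: responds only where neighbouring values change,
# i.e. it detects the simple structure "an edge" in the signal.
signal = [0, 0, 0, 1, 1, 1, 0, 0]
kernel = [-1, 1]
print(convolve1d(signal, kernel))  # [0, 0, 1, 0, 0, -1, 0]
```

In a CNN the same idea is applied in two dimensions over images, and stacking convolutional layers lets simple detected structures (edges) combine into more complex features (shapes, then objects).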
Recurrent neural networks, or RNNs, are called recurrent because they perform the same task for every element of a sequence, with prior outputs feeding into subsequent steps. In a general neural network, an input is processed through a number of layers and an output is produced under the assumption that two successive inputs are independent of each other, but that may not hold true in certain scenarios. For example, when we need to consider the context in which a word has been spoken, dependence on previous observations has to be considered to produce the output. RNNs can make use of information in long sequences, with each step of the network representing an observation at a certain time.
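The recurrence can be sketched with a scalar hidden state updated once per sequence element: the same function is applied at every step, and the hidden state carries information from earlier inputs forward. The weights are arbitrary values chosen for illustration.

```python
import math

def rnn_step(x, h, w_x, w_h, b):
    """One recurrent step: the new hidden state mixes the current input
    with the hidden state carried over from the previous step."""
    return math.tanh(w_x * x + w_h * h + b)

# Hypothetical scalar RNN processing a sequence one element at a time
w_x, w_h, b = 0.8, 0.5, 0.0
h = 0.0
for x in [1.0, 0.0, 0.0]:      # only the first element is non-zero...
    h = rnn_step(x, h, w_x, w_h, b)
    print(round(h, 3))          # ...yet h stays non-zero: the state remembers it
```

Even after the input goes back to zero, the hidden state retains a fading trace of the earlier observation, which is exactly the dependence on previous inputs that a plain feed-forward network cannot express.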