I recently began exploring how to create machine learning projects and found a great introductory video (linked below)! This post isn’t very in-depth, and I won’t be explaining the syntax of the code yet, but I hope it’s a good introduction to the process of creating a machine learning project. In a future post, I plan to explain how to actually understand the code. When you’re first learning to code, it’s natural to follow tutorials online rather than write code on your own.
Basic Steps
- Import data: Machines learn from the data they are given, so it’s important that your data is reliable and accurate. One place to find data is kaggle.com, a website with many datasets, which is the source the video used.
- Organize the data: Sometimes, you may have to clean the data by removing duplicated, irrelevant, or incomplete entries. Additionally, the data might not be in the format you need. For example, it may be given to you as words when you actually need it in numerical form.
- Split the data into training/testing sets: The training set is what the model learns from, and the testing set is used to check the model’s accuracy. A common ratio is 70-80% of the dataset used for training and 20-30% for testing.
- Select a model: The model you choose should align with the results you want. This means that you should understand whether a model predicts numerical values or categories, and pick one based on the kind of data you have.
- Train the model: The training dataset is given to the model so that it can find patterns and make predictions.
- Make predictions: Once trained, the model can be used to make predictions when it’s given data from the testing set.
- Tune the model: Your model’s accuracy might not be very high, so you can tune it. Parameters are variables that determine the behavior of the algorithm. Model parameters are learned by the algorithm during training, whereas hyperparameters are set by the person building the model, and these are the ones you adjust when tuning.
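To make the steps above a little more concrete, here is a minimal sketch in plain Python. It is not the code from the video. The tiny dataset is made up, the column names (height_cm, weight_kg, size) are hypothetical, and k-nearest neighbours is just one simple model picked for illustration. A real project would more likely load a CSV file from Kaggle and use a machine learning library instead of writing the model by hand.

```python
import csv
import io
import math
import random
from collections import Counter

# 1. Import data. A tiny made-up CSV stands in for a downloaded dataset.
RAW_CSV = """height_cm,weight_kg,size
150,48,small
151,50,small
152,49,small
153,52,small
154,51,small
155,54,small
156,53,small
157,55,small
158,56,small
159,57,small
150,48,small
170,,large
180,80,large
181,82,large
182,81,large
183,84,large
184,83,large
185,86,large
186,85,large
187,87,large
188,88,large
189,89,large
"""
rows = list(csv.DictReader(io.StringIO(RAW_CSV)))

# 2. Organize the data: drop incomplete rows and duplicates, then turn
#    words into numbers (features as floats, labels as integer codes).
LABEL_CODES = {"small": 0, "large": 1}
seen = set()
dataset = []
for row in rows:
    key = tuple(row.values())
    if "" in key or key in seen:
        continue  # skip rows with a missing value or an exact repeat
    seen.add(key)
    features = (float(row["height_cm"]), float(row["weight_kg"]))
    dataset.append((features, LABEL_CODES[row["size"]]))

# 3. Split the data into training (80%) and testing (20%) sets.
random.Random(0).shuffle(dataset)  # fixed seed so the split is repeatable
split_at = round(len(dataset) * 0.8)
train_set, test_set = dataset[:split_at], dataset[split_at:]

# 4. Select a model: k-nearest neighbours, which predicts a category.
def predict(train, point, k):
    """Label `point` by majority vote among its k closest training examples."""
    nearest = sorted(train, key=lambda example: math.dist(example[0], point))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

def accuracy(train, test, k):
    """Fraction of test examples whose predicted label matches the true one."""
    correct = sum(predict(train, features, k) == label for features, label in test)
    return correct / len(test)

# 5. Train the model. For k-nearest neighbours, "training" only means
#    keeping the training examples around to compare against later.
# 6. Make predictions on the testing set and check how many are right.
print("accuracy with k=3:", accuracy(train_set, test_set, 3))

# 7. Tune the model: k is a hyperparameter (chosen by us, not learned).
#    Scoring candidates on the testing set keeps this sketch short; a
#    separate validation set is the more careful choice.
scores = {candidate: accuracy(train_set, test_set, candidate) for candidate in (1, 3, 5)}
best_k = max(scores, key=scores.get)
print("best k:", best_k)
```

Each numbered comment matches one of the steps in the list, so the sketch can be read alongside it from top to bottom.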
Conclusion
Although the process of creating a machine learning project may still seem complicated and abstract, I will soon create a post in which I show the code created by following the YouTube video below and how each step I listed above corresponds to the code created. See you in the next post!
https://www.simplilearn.com/tutorials/machine-learning-tutorial/machine-learning-steps