10 Machine Learning Project (Thesis) Topics for 2020

Are you looking for interesting project ideas for your thesis, project or dissertation? A machine learning topic is a very good choice. I have outlined 10 different topics below. These topics work well because the datasets are easy to obtain (I will provide a link to each dataset) and you can also get some support from me. Let me know if you need any support in preparing your thesis.

You can leave a comment below in the comment area.



1.  Machine Learning Model for Classification and Detection of Breast Cancer (Classification)

The data is provided by the Oncology department. Each instance is described by nine attributes.

You can obtain the dataset from here


2. Intelligent Internet Ads Generation (Classification)

This is one of the most interesting topics for me. The revenue generated or spent by an ad campaign depends not just on the volume of the ads, but also on their relevance. It is therefore possible to increase revenue and reduce spending by developing a machine learning model that selects relevant ads with a high level of accuracy. The dataset provides a collection of ads along with the structure and geometry of each ad.

Get the ads dataset from here


3. Feature Extraction for National Census Data (Clustering)

This looks like big data stuff. But no! It’s simply a dataset you can use for analysis. It is actual data obtained from the 1990 US census. Each record has 68 attributes, and clustering is performed to identify trends in the data.

You can obtain the census dataset from here
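
As a rough illustration of what the clustering step might look like, here is a minimal k-means sketch in plain Python. The toy records, the function name `kmeans`, the choice of k = 2 and the "first k records" starting point are all assumptions made for illustration. They are not part of the census dataset, and a real project would load the full attribute set and most likely use a library implementation.

```python
def kmeans(points, k, iterations=20):
    """Cluster numeric tuples into k groups with plain k-means."""
    # Simple deterministic start: use the first k records as centroids.
    # (Real projects usually prefer random or k-means++ initialisation.)
    centroids = list(points[:k])
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid
        # (squared Euclidean distance).
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids


# Hypothetical toy records, e.g. (age bracket, income bracket).
records = [(1, 1), (1, 2), (2, 1), (8, 9), (9, 8), (9, 9)]
labels, centroids = kmeans(records, k=2)
print(labels)
```

On this toy input the first three records end up in one cluster and the last three in the other. With real census data you would also need to decide how to encode the categorical attributes before measuring distances.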


4. Movie Outcome Prediction (Classification)

This is quite a demanding project, but it’s quite interesting. Models already exist that predict the ratings of movies on a scale of 0 to 10 or 1 to 5. But this project takes it a step further: you actually need to determine the outcome of the movie. The dataset is a large multivariate dataset covering the movie director, cast, individual roles of the actors, remarks, studio and relevant documents.

You can get the movies dataset from here


5. Forest Fire Area Coverage Prediction (Regression)

This project has been classified as difficult, but I don’t think so. The objective is to predict the area affected by forest fires. The dataset includes relevant meteorological information and other parameters taken from a region of Portugal.

You can get the fire dataset from here
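
To give a feel for the regression task, here is a minimal sketch that fits a straight line with ordinary least squares in plain Python. The function name `fit_line` and the toy temperature and area values are hypothetical and chosen only so the arithmetic is easy to follow. They are not taken from the fire dataset, and a real model would use several input features.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and co-deviation of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical toy values: temperature (degrees C) and burned area (hectares).
temps = [10.0, 15.0, 20.0, 25.0, 30.0]
areas = [1.0, 2.0, 3.0, 4.0, 5.0]

slope, intercept = fit_line(temps, areas)
predicted = slope * 22.0 + intercept  # estimated area at 22 degrees C
print(slope, intercept, predicted)
```

A single-feature line is only a starting point. It is mainly useful as a baseline to compare more capable regression models against.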


6. Atmospheric Ozone Level Analysis and Detection (Clustering)

Two ground-level ozone datasets are provided for this. The data includes temperatures at various times of the day as well as wind speed. It was collected over a span of 6 years, from 1998 to 2004.

You can get the Ozone dataset from here


7. Crime Prediction in New York City (Regression)

If you have watched the TV series ‘Person of Interest’, created by Jonathan Nolan, then you will appreciate the possibility of predicting violent criminal activities before they actually occur. The dataset would contain historical data on crime rates and the types of crime occurring in each region.

You can get the crime dataset from here


8. Sentiment Analysis on Amazon ECommerce User Reviews (Classification)

The dataset for this project is derived from review comments written by Amazon users. The model should learn from the training dataset and classify the reviews based on sentiment. Granularity can be improved by generating predictions based on location and other factors.

You can get the reviews dataset from here
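
One simple way to approach this kind of classifier is a bag-of-words Naive Bayes model. The sketch below is a minimal plain-Python version under stated assumptions: the function names `train` and `predict`, the whitespace tokenisation and the four toy reviews are all hypothetical and are not drawn from the Amazon dataset.

```python
import math
from collections import Counter


def train(reviews):
    """Count words and documents per label from (text, label) pairs."""
    word_counts = {}
    doc_counts = Counter()
    for text, label in reviews:
        doc_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(text.lower().split())
    return word_counts, doc_counts


def predict(text, word_counts, doc_counts):
    """Return the label with the highest Naive Bayes log-probability."""
    vocab = {w for counts in word_counts.values() for w in counts}
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, -math.inf
    for label, counts in word_counts.items():
        # Start from the log prior of the label.
        score = math.log(doc_counts[label] / total_docs)
        total_words = sum(counts.values())
        for word in text.lower().split():
            # Laplace (add-one) smoothing avoids zero probabilities.
            score += math.log((counts[word] + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy training reviews.
toy_reviews = [
    ("great product works well", "positive"),
    ("love it great value", "positive"),
    ("terrible quality broke fast", "negative"),
    ("waste of money terrible", "negative"),
]
word_counts, doc_counts = train(toy_reviews)
print(predict("great value", word_counts, doc_counts))
```

Real reviews would need better tokenisation and far more training data, but the same train-then-predict structure applies.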


9. Home Electrical Power Consumption Analysis (Regression)

Everyone uses electricity at home. Or rather, almost everyone! Would it not be great to have a system that helps to predict electricity consumption? The training dataset provided for this project includes features such as the size of the home, duration and more.

You can get the dataset from here


10. Predictive Modelling of Individual Human Knowledge (Classification and Clustering)

Here the available dataset provides a collection of data about individuals on a subject matter. You are required to create a model that tries to quantify the amount of knowledge an individual has on the given subject. You can be creative by also trying to infer the performance of the user on certain exams.

You can get the dataset from here

I hope these 10 Machine Learning project topics will be helpful to you.

Thanks for reading, and do leave a comment below if you need some support.



Kindson Munonye is currently completing his doctoral program in Software Engineering at the Budapest University of Technology and Economics.
