kNN is a distinctive machine learning algorithm with its own pros and cons. The good news is that kNN is very simple to understand and implement, which makes it a good option to include in a Machine Learning or Data Science toolbox.
1- Slow Prediction
Many machine learning algorithms are slow to train. kNN is different: it has no real training phase, since fitting only stores the training data. The work is deferred to prediction time, when distances to the stored samples must be computed for every query.
Speed-wise, kNN struggles in the inference phase (predictions) and doesn’t scale very well.
Because this distance computation is repeated for every point being predicted and grows with the size of the training set, kNN isn’t a very big-data-friendly machine learning algorithm.
See kNN Complexity.
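To make the cost concrete, here is a minimal brute-force sketch in NumPy. The class name BruteForceKNN and the toy data are hypothetical and exist only for illustration; real libraries can use tree-based indexes to speed up the search, which this sketch does not attempt.

```python
import numpy as np


class BruteForceKNN:
    """Hypothetical minimal kNN classifier, for illustration only."""

    def __init__(self, k=3):
        self.k = k

    def fit(self, X, y):
        # "Training" only stores the data; no model parameters are learned.
        self.X_train = np.asarray(X, dtype=float)
        self.y_train = np.asarray(y)
        return self

    def predict(self, X):
        X = np.asarray(X, dtype=float)
        predictions = []
        for query in X:
            # One distance per stored training sample: O(n * d) work per query.
            distances = np.linalg.norm(self.X_train - query, axis=1)
            nearest = np.argsort(distances)[: self.k]
            # Majority vote among the k nearest labels.
            labels, counts = np.unique(self.y_train[nearest], return_counts=True)
            predictions.append(labels[np.argmax(counts)])
        return np.array(predictions)


# Made-up toy data: two well-separated clusters.
X_train = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
                    [5.0, 5.0], [5.0, 6.0], [6.0, 5.0]])
y_train = np.array([0, 0, 0, 1, 1, 1])

model = BruteForceKNN(k=3).fit(X_train, y_train)
preds = model.predict(np.array([[0.5, 0.5], [5.5, 5.5]]))
print(preds)
```

Note that fit does almost nothing, while predict touches every stored sample for every query. That is why prediction time grows with the size of the training set.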
2- Sensitive to Outliers
kNN also handles outliers poorly.
Outliers cause trouble for kNN from both the training and the prediction perspective, because the algorithm relies entirely on distance calculations and doesn’t learn a general model of the dataset.
If the data has outliers, they distort the distance calculations and the neighbor votes. And if you’d like to predict for an outlier, the result will be unreliable, because nothing similar was present in the training data and kNN can only echo its nearest stored samples.
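Both effects can be seen in a small kNN regression sketch. The helper knn_regress and the numbers are made up for illustration: it averages the targets of the k nearest training points in one dimension.

```python
import numpy as np


def knn_regress(X_train, y_train, query, k=3):
    """Hypothetical 1-D kNN regressor: mean target of the k nearest points."""
    distances = np.abs(X_train - query)
    nearest = np.argsort(distances)[:k]
    return y_train[nearest].mean()


X_train = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y_clean = np.array([10.0, 20.0, 30.0, 40.0, 50.0])

# Same data, but the target at x=3 is replaced by an outlier.
y_outlier = y_clean.copy()
y_outlier[2] = 300.0

# Neighbors of x=3 are x=2, 3, 4 in both cases.
clean_pred = knn_regress(X_train, y_clean, 3.0)
outlier_pred = knn_regress(X_train, y_outlier, 3.0)

# A query far outside the training range only sees x=3, 4, 5.
far_pred = knn_regress(X_train, y_clean, 100.0)

print(clean_pred, outlier_pred, far_pred)
```

With the clean targets the prediction at x=3 is 30, while a single outlier among the three neighbors pulls it to 120. The query at x=100 gets 40, the average of the three largest training points, because kNN cannot extrapolate beyond what it has stored.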
3- Missing Values
kNN also will not work with missing values.
Since kNN relates every sample to the others through distance calculations, a missing value in any feature leaves those distances undefined and threatens the whole system.
To address this issue, there is a dedicated imputer named KNNImputer in the sklearn.impute module, which can be used to pre-process data with missing values and make it ready for the kNN machine learning algorithm.
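A minimal sketch of that pre-processing step, assuming scikit-learn is installed. The toy data and the choice of n_neighbors=2 are made up for illustration.

```python
import numpy as np
from sklearn.impute import KNNImputer
from sklearn.neighbors import KNeighborsClassifier

# Made-up data: the second feature is missing in two rows.
X = np.array([[1.0, 2.0],
              [2.0, np.nan],
              [3.0, 6.0],
              [8.0, 16.0],
              [9.0, np.nan],
              [10.0, 20.0]])
y = np.array([0, 0, 0, 1, 1, 1])

# Fill each missing value with the mean of that feature over the
# 2 nearest rows that have it (distances use the available features).
imputer = KNNImputer(n_neighbors=2)
X_filled = imputer.fit_transform(X)

# The imputed matrix has no NaN values and can be passed to kNN.
clf = KNeighborsClassifier(n_neighbors=3).fit(X_filled, y)
preds = clf.predict(X_filled)

print(X_filled)
print(preds)
```

In a real project the imputer would be fitted on the training split only and then applied to the test split with transform, so that no information leaks from the test data.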
4- No Probabilistic Reports
kNN is a non-parametric machine learning algorithm, and kNN models don’t produce genuine probabilistic reports. At best, an implementation can report the share of neighbors that voted for each class, which is a coarse estimate rather than the output of a probability model. Probabilistic reports tell the probability of a prediction in addition to the prediction itself and can be useful in scientific research or more elaborate machine learning implementations. Probabilistic models can also be useful in hybrid machine learning implementations where another ML model is triggered based not only on the prediction but also on the probability of the prediction.
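One nuance worth noting: scikit-learn’s KNeighborsClassifier does expose a predict_proba method, but with the default uniform weights the values are simply the fraction of the k neighbors voting for each class, not the output of a fitted probability model. A small sketch with made-up one-dimensional data:

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Made-up 1-D data: class 0 near the origin, class 1 around x=11.
X = np.array([[0.0], [1.0], [2.0], [10.0], [11.0], [12.0]])
y = np.array([0, 0, 0, 1, 1, 1])

clf = KNeighborsClassifier(n_neighbors=3).fit(X, y)

# First query sits inside class 0; second sits between the clusters,
# where its 3 nearest neighbors are x=10, x=2 and x=11.
proba = clf.predict_proba(np.array([[1.0], [6.2]]))
print(proba)
```

With k=3 the only possible values are 0, 1/3, 2/3 and 1, so the output is too coarse to serve as a confidence score unless k is large or the scores are calibrated separately.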
For simple and straightforward machine learning tasks, kNN still produces very satisfactory results for both classification and regression. You can see the advantages of kNN below: