Publication Date

Spring 5-31-2017

Degree Type

Master's Project

Degree Name

Master of Science (MS)

Department

Computer Science

First Advisor

Leonard Wesley

Second Advisor

Robert Chun

Third Advisor

James Casaletto

Abstract

The relationship between house prices and the economy is an important motivating factor for predicting house prices. Housing price trends are not only the concern of buyers and sellers; they also indicate the current economic situation. Therefore, it is important to predict housing prices without bias to help both buyers and sellers make their decisions. This project uses an open source dataset, which includes 20 explanatory features and 21,613 entries of housing sales in King County, USA. We compare different feature selection methods and a feature extraction algorithm, each combined with Support Vector Regression (SVR), to predict house prices in King County, USA. The feature selection methods used in the experiments include Recursive Feature Elimination (RFE), Lasso, Ridge, and Random Forest Selector. The feature extraction method in this work is Principal Component Analysis (PCA). After applying the different feature reduction methods, a regression model using SVR was built. With log transformation, feature reduction, and parameter tuning, the price prediction accuracy increased from 0.65 to 0.86. The lowest MSE is 0.04. The experimental results show no difference in performance between PCA-SVR and feature selection-SVR in predicting housing prices in King County, USA. The benefit of applying feature reduction is that it helps identify the more important features, so that the model is not over-fit with too many features.
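The comparison described in the abstract can be illustrated with a minimal sketch, which is not the project's actual code. It assumes scikit-learn and uses a synthetic dataset in place of the King County data. The sample size, the number of retained features, the SVR hyperparameters, and the helper name `build_model` are all hypothetical choices made for illustration. The sketch log-transforms the target, reduces 20 features to 10 with either PCA or RFE, and fits an SVR on the result.

```python
import numpy as np
from sklearn.compose import TransformedTargetRegressor
from sklearn.datasets import make_regression
from sklearn.decomposition import PCA
from sklearn.feature_selection import RFE
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic stand-in with 20 features (NOT the King County dataset).
X, y_raw = make_regression(
    n_samples=400, n_features=20, n_informative=8, noise=10.0, random_state=0
)
# Turn the target into positive, right-skewed "prices" so a log transform makes sense.
y = np.exp(0.5 * y_raw / y_raw.std() + 13.0)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)


def build_model(reducer):
    """Hypothetical helper: scale, reduce features, then fit SVR on log(price)."""
    pipeline = make_pipeline(
        StandardScaler(),
        reducer,
        SVR(kernel="rbf", C=10.0, epsilon=0.01),  # illustrative, untuned values
    )
    # Train on log(price); predictions are mapped back with exp.
    return TransformedTargetRegressor(
        regressor=pipeline, func=np.log, inverse_func=np.exp
    )


models = {
    # Feature extraction: project onto 10 principal components.
    "PCA-SVR": build_model(PCA(n_components=10)),
    # Feature selection: keep 10 features chosen by recursive elimination.
    "RFE-SVR": build_model(RFE(LinearRegression(), n_features_to_select=10)),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)  # R^2 on the original price scale
    print(name, round(scores[name], 3))
```

Scores on this synthetic data say nothing about the reported results. The sketch only shows how the two reduction strategies can share the same SVR stage so that they are compared on equal terms.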
