The project was carried out with the Anaconda Python 3.0 distribution. In addition, the following Python libraries were used:
I was curious about the AirBnB dataset for Seattle and wanted to learn more about pricing patterns, customer feedback, and price prediction. Some of the questions I looked into are:
According to the chart above, the peak months are June through August, with July the highest. With summer in full swing and little chance of rain, the chart supports my hypothesis that these months have the best weather in Seattle.
Furthermore, the year appears to start slowly, with the lowest average price in January. Prices begin to rise around April/May as spring and the holiday season approach, and in November/December for the winter holidays.
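The monthly trend described above can be reproduced with a small pandas aggregation. This is a minimal sketch, not the notebook's actual code: the toy data, the `date` and `price` column names, and the `monthly_average_price` helper are assumptions made for illustration.

```python
import pandas as pd

# Toy stand-in for the calendar data; column names and values are assumptions.
calendar = pd.DataFrame({
    "date": ["2016-01-04", "2016-01-20", "2016-07-04", "2016-07-18"],
    "price": ["$80.00", "$100.00", "$150.00", "$1,050.00"],
})


def monthly_average_price(df):
    """Return the mean price per calendar month (1-12)."""
    # Prices are assumed to be stored as strings such as "$1,050.00".
    prices = df["price"].str.replace(r"[$,]", "", regex=True).astype(float)
    months = pd.to_datetime(df["date"]).dt.month
    return prices.groupby(months).mean()


monthly = monthly_average_price(calendar)
print(monthly)
```

Plotting `monthly` as a bar or line chart would give a view comparable to the one discussed here.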
According to the above analysis, average prices differ noticeably between neighbourhoods. With an average price of $231, Southeast Magnolia appears to be the most expensive of all, followed by Portage Bay at $227.
Rainier Beach appears to be the cheapest, with an average price of $68.
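A ranking like this one comes down to a group-by on the neighbourhood column. The sketch below uses made-up data and assumed column names (`neighbourhood`, `price`); it only illustrates the technique and is not taken from the notebook.

```python
import pandas as pd

# Toy listings; neighbourhood labels and prices are invented for illustration.
listings = pd.DataFrame({
    "neighbourhood": ["A", "A", "B", "C", "C"],
    "price": [200.0, 260.0, 70.0, 120.0, 100.0],
})

# Average price per neighbourhood, most expensive first.
avg_by_area = (
    listings.groupby("neighbourhood")["price"]
    .mean()
    .sort_values(ascending=False)
)

most_expensive = avg_by_area.index[0]
cheapest = avg_by_area.index[-1]
print(avg_by_area)
```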
From the analysis above, we concentrated on the five most expensive neighbourhoods and on houses and apartments, because the earlier analysis showed that these two make up a significant portion of property types.
Houses in Portage Bay are the most expensive, followed by houses in West Queen Anne and Westlake, as seen above. It’s worth noting that in Westlake, houses and apartments are almost the same price.
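One way to build a neighbourhood-by-property-type comparison like this is a pivot table restricted to the top neighbourhoods. The following is a hedged sketch: the data, the column names, and the `price_by_area_and_type` helper are hypothetical, and the notebook may have done this differently.

```python
import pandas as pd

# Toy listings; all values and column names are assumptions.
listings = pd.DataFrame({
    "neighbourhood": ["A", "A", "A", "B", "B", "C", "C"],
    "property_type": ["House", "Apartment", "Boat", "House",
                      "Apartment", "House", "Apartment"],
    "price": [300.0, 200.0, 500.0, 150.0, 150.0, 60.0, 50.0],
})


def price_by_area_and_type(df, top_n=2, types=("House", "Apartment")):
    """Mean price per neighbourhood and property type for the top_n areas."""
    # Rank neighbourhoods by overall average price across all listings.
    top = df.groupby("neighbourhood")["price"].mean().nlargest(top_n).index
    # Keep only the top neighbourhoods and the requested property types.
    subset = df[df["neighbourhood"].isin(top) & df["property_type"].isin(types)]
    return subset.pivot_table(
        index="neighbourhood",
        columns="property_type",
        values="price",
        aggfunc="mean",
    )


table = price_by_area_and_type(listings)
print(table)
```

A table of this shape could be passed to `seaborn.heatmap` to produce a heatmap view.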
Some of the best-rated neighbourhoods include Roxhill, Cedar Park, and Pinehurst. University District, Holly Park, and View Ridge are the neighbourhoods with the lowest ratings.
It’s worth noting that the majority of the reviews with low polarity scores appear to be written in a language other than English. This may be a limitation of the Sentiment Intensity Analyzer.
The other three reviews appear to be genuine complaints, with users lamenting the lack of A/C and fans, the host’s rudeness, construction noise disrupting people’s stay, and the place’s terrible state, among other things.
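Pulling out the lowest-polarity reviews for manual inspection, as done above, can be sketched as follows. The `lowest_polarity_reviews` helper, the toy scorer, and the sample reviews are all hypothetical; the toy scorer only stands in for a real analyzer so that the example runs without extra downloads.

```python
def lowest_polarity_reviews(reviews, score, n=3):
    """Return the n (score, text) pairs with the lowest polarity score."""
    scored = [(score(text), text) for text in reviews]
    return sorted(scored, key=lambda pair: pair[0])[:n]


# Toy scorer used only for illustration. With NLTK installed and the
# vader_lexicon resource downloaded, a real scorer could be:
#   from nltk.sentiment import SentimentIntensityAnalyzer
#   sia = SentimentIntensityAnalyzer()
#   score = lambda text: sia.polarity_scores(text)["compound"]
NEGATIVE_WORDS = {"rude", "noise", "dirty"}


def toy_score(text):
    words = [w.strip(".,!") for w in text.lower().split()]
    return -sum(w in NEGATIVE_WORDS for w in words) / max(len(words), 1)


reviews = [
    "Lovely place, great host.",
    "Rude host and constant noise.",
    "Clean and quiet.",
]
worst = lowest_polarity_reviews(reviews, toy_score, n=1)
print(worst)
```

Reading the returned texts by hand is what reveals cases such as non-English reviews that the scorer may not handle well.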
The Jupyter notebook documents the exploration of the dataset, the data preparation and wrangling, and the prediction models built to answer the questions above. Markdown cells in the notebook document each step and communicate the findings of each analysis.
For reference, an HTML version of the notebook is also available.
Lastly, the seattle folder contains the dataset from Kaggle (https://www.kaggle.com/airbnb/seattle).
It consists of three files:
The main findings from the analysis are:
I have written a blog post about the project and its findings on GitHub Pages: https://abdishakury.github.io/
Kudos to AirBnB for uploading the dataset and Kaggle for hosting it; the dataset can be found here: https://www.kaggle.com/airbnb/seattle
SentimentIntensityAnalyzer Reference: https://www.nltk.org/api/nltk.sentiment.html
Heatmap Reference: https://seaborn.pydata.org/generated/seaborn.heatmap.html