Chapter 7 Conclusion

7.1 Overview

In this project, we analyzed the job market for data-related jobs. To do so, we collected data from three main sources:

  1. Job market reports from Glassdoor: mainly used to (1) compare statistics for data-related jobs against other jobs and (2) capture the temporal trend in data-related job salaries.

  2. Indeed job search results: mainly used to (1) analyze the skill sets required for different data-related jobs and (2) analyze the regional distribution of data-related job postings.

  3. City data: mainly used to explore potential factors affecting the median base pay of data-related jobs in different locations.

After cleaning and analyzing the data, we arrived at the following findings:

  1. The salaries of data analysts, business analysts, and financial analysts are similar, while the salary of data scientists is much higher.

  2. Over the past two years, the salaries of the four aforementioned data-related jobs have not changed much. However, the salary of data scientists fluctuated the most, and it dropped at the beginning of 2019.

  3. The following advice is based purely on salary: if you want to be a data scientist, we suggest working in SF or Seattle, and we do not recommend Houston or DC. We are neutral on the other cities.

  4. California and New York provide the most data-related jobs in the U.S. California provides more data scientist positions, while New York offers more analyst positions.

  5. The most relevant skill for data scientists is machine learning; other data-related skills, including data mining, visualization, and deep learning, also appear frequently in job descriptions. Analyst jobs are less technical than data scientist jobs: for analysts, more emphasis is placed on business sense, project management skills, and problem-solving ability.

7.2 Limitations and future work

  1. Due to the lack of historical data on the number of job openings, we were not able to analyze its temporal trend. If we can gather this data later, we will be able to visualize and analyze the historical trend in data-related job openings and verify whether the data industry is indeed growing and offering more and more positions.

  2. Due to ethical and legal concerns, we scraped only a limited amount of job search data from Indeed. If we later gain access to the Indeed API, we can conduct the analysis on a larger data set and provide more comprehensive insights.