QUA4A5 Group Project
You should produce both Microsoft Excel and a Microsoft Word Document words limit (max 1800). All of the work described above should be done using Excel and described and interpreted in Word. You should paste in tables of your results and any figures (graphs) from the spreadsheet into your Word document. It is important that you include suitable commentary and that you draw conclusions that are useful to the commercial operations of TfL.
The approximate marking weight for the various parts of the assignment are given above in [Bold square brackets] so that you can assign the appropriate amount of your time and effort to each part.
words limit: max 1800
Tfl Case study assignment –
You are the new Director of Operations for Transport for London’s (TfL’s) ‘Santander’ Bike Hire Scheme. The docking machines all over London secure bikes that have been returned to the docks by customers, but they are then not in the optimum place at the start of each day for the customers to hire them again. So every night, TfL re-distributes bikes back to the docks where they are needed. The number of bikes hired each day is automatically recorded on each day and you have the data for each day’s hires between August 2010 and July 2014. In addition, daily temperature data has been included for each day of the recorded sample. Your job involves analyse this large dataset to address the following task. You will consider this large dataset that you have been provided with as a population.
- What are the average daily number of bikes hired, the standard deviation and the 95% confidence interval for this population bike hires? [5%]
Using some of these statistics and the maximum and minimum number of daily hires, can you determine whether this population of daily hires is normally distributed or not? Illustrate your answer graphically. [10%]
Why is a normally distributed population important for data analysis? [5%]
- Now use the RANDBETWEEN function in Microsoft Excel to select three simple random samples of 10, 50 and 500 days from the population. Using just your samples, calculate the mean, median, standard deviations and mode if applicable for each random sample. [10%].
Compare these outcome figures. How do the mean and SD change with sample size? [10%].
- Now use Forecast function in Microsoft Excel for the last part of the data set to predict the next value based on a linear trend for the No. of hired Bicycle
|No.||Seasons||Number of Bicycle Hires|
Comment on your results. How might this data be used to efficiently to predict the required future resources within TfL? [5%]
- Create a Pivot table in Excel to calculate the average number of hires on each day (Monday to Sunday) over the population and summaries this in a table. Comment on your results. How might this data be used to efficiently allocate resources within TfL? [10%]
Create a Pivot table in Excel to calculate the sum number of hires on each season (winter, spring, summer and autumn) over the population and summaries this in a table. Comment on your results. How might this data be used to efficiently allocate resources within TfL? [10%]
Create a Pivot Report and a Pivot Chart of the best scenario that this data set can present also use slicer in Excel Comment on your results. How might this Report and chart used to efficiently allocate resources within TfL? [10%]
- Included in the dataset is a record of the average temperatures for each of the days in the study.
Using correlational analysis, establish whether there is a relationship between the temperature and the number of bikes hired. Think about how you might visualise this relationship in Excel. [10%]
Using Regression Analysis, further explore the relationship between temperature and bicycle hires. Think about the key aspects of the analysis and how you would comment on the outcomes. Interpret these outcomes and produce an appropriate narrative to accompany the statistics. [10%]
Comment on the limitations of the data. How could data collection be improved to provide a more accurate analysis? [5%]