2021Fall_IE342_MiniProject2 5 (Instructor: Azadeh Haghighi)
Mini Project 2
Directions:
In this project, you will graphically present, study, analyze, and evaluate the distributions of a dataset. Upload your whole package (including any code, plots, and the report) to Blackboard by the due date.
The Data:
Collect the sizes of 1000 files on your computer. Convert all sizes to the same unit (e.g., KB or MB). You can do a quick search online (or get help from your classmates) to find out how to collect file sizes automatically using the Windows dir command. You can also record the file sizes manually if you cannot figure out how to do it automatically.
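As an alternative to the dir command, the collection step could also be scripted. The following is a minimal Python sketch, not a required method. The function name `collect_file_sizes_kb`, the choice of KB as the common unit, and the decision to skip empty files (a size of 0 has no first nonzero digit) are all assumptions made for illustration.

```python
import os


def collect_file_sizes_kb(root, limit=1000):
    """Walk the folder `root` and return up to `limit` file sizes in KB.

    Assumption: empty files are skipped, because a size of 0 has no
    first nonzero digit to extract later.
    """
    sizes = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                size_bytes = os.path.getsize(path)
            except OSError:
                continue  # skip files that cannot be read or have vanished
            if size_bytes == 0:
                continue  # skip empty files (see assumption above)
            sizes.append(size_bytes / 1024)  # bytes -> KB
            if len(sizes) >= limit:
                return sizes
    return sizes


if __name__ == "__main__":
    # The home folder is only an example starting point.
    sizes = collect_file_sizes_kb(os.path.expanduser("~"))
    print(f"Collected {len(sizes)} file sizes (KB)")
```

If fewer than 1000 non-empty files exist under the chosen folder, the function simply returns what it found, so the count should be checked before moving on.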
Dataset
Use the above dataset to plot the necessary graphs and answer the following questions.
1. Extract the first digit of each collected file size and record the digits in a table. The first digit is defined as the first nonzero digit from the left. For example, 1294 has first digit 1, and 0.34 has first digit 3.
2. Before plotting the distribution of these first digits, which probability distribution do you think would best represent it? Why?
3. Now plot the probability distribution of these first digits (the vertical axis represents the probability, and the horizontal axis represents the digits 1, 2, 3, 4, 5, 6, 7, 8, 9). Looking at the plot, how does the probability change as you move from digit 1 to digit 9? (i.e., does it increase or decrease?)
4. Now use the following function to calculate and plot the probability of digit $d$:
$$P(d) = \log_{10}\left(1 + \frac{1}{d}\right), \quad d = 1, 2, 3, 4, 5, 6, 7, 8, 9$$
5. Merge the plots obtained in parts 3 and 4 so that they are shown side by side as clustered columns (see the visualization example below; the probability values in that figure are arbitrary and do not reflect any realistic case). Compare the two plots. What do you observe? Do you see any relation between the two plots?
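[Figure: example clustered column chart. Vertical axis: probability (0 to 1). Horizontal axis: digit (1, 2, 3, …). Series: "Plot from 3" and "Plot from 4".]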
6. Search for “Benford’s law” online and explain what it states.
7. Using Benford’s law, explain what you observe in the plot from part 5.
8. Find a real-life application of Benford’s law by searching the internet. Explain why this law is useful for that specific application.
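The computations behind the first-digit table, the observed probabilities, and the logarithmic function can be sketched in Python as follows. This is one possible approach under stated assumptions, not a prescribed solution: the function names and the sample sizes are hypothetical, and the plotting itself is left to whichever tool you prefer. The first digit is found by scanning the number's text form from the left, which mirrors the definition given in the questions.

```python
import math
from collections import Counter


def first_digit(value):
    """Return the first nonzero digit from the left, e.g. 1294 -> 1, 0.34 -> 3."""
    for ch in str(value):
        if ch in "123456789":
            return int(ch)
    raise ValueError(f"{value!r} has no nonzero digit")


def empirical_probabilities(values):
    """Relative frequency of each first digit 1..9 among `values`."""
    counts = Counter(first_digit(v) for v in values)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no values given")
    # A Counter returns 0 for digits that never occur.
    return {d: counts[d] / total for d in range(1, 10)}


def formula_probability(d):
    """Evaluate log10(1 + 1/d) for a digit d in 1..9."""
    return math.log10(1 + 1 / d)


if __name__ == "__main__":
    # Hypothetical sample sizes; replace with the 1000 collected file sizes.
    sample_sizes_kb = [1294, 0.34, 12.5, 187, 2.2, 1.05, 930, 16, 3.7, 110]
    observed = empirical_probabilities(sample_sizes_kb)
    print("digit  observed  formula")
    for d in range(1, 10):
        print(f"{d:>5}  {observed[d]:8.3f}  {formula_probability(d):7.3f}")
```

The two probability columns printed at the end are the values that would feed the clustered column chart; with only a handful of sample values the observed column is very coarse, so the comparison is meaningful only on the full dataset.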