Describing Data II. Lecture 2. February 15th, 2010.



Your Task 4. I'm not sure we can complete this presentation today. Your fourth task is to look through the whole presentation at home and send me a list of the numbers of the slides that are unfamiliar or difficult to understand. Directions: folder Task 4 at ier.mylivepage.ru

Chap 2-3 Statistics for Business and Economics, 6e © 2007 Pearson Education, Inc. Theme 2. Describing Data: Graphical. Statistics for Business and Economics, 6th Edition

Chap 2-4 Theme Goals
After completing this theme, you should be able to:
Identify types of data and levels of measurement
Create and interpret graphs to describe categorical variables: frequency distribution, bar chart, pie chart, Pareto diagram
Create a line chart to describe time-series data
Create and interpret graphs to describe numerical variables: frequency distribution, histogram, ogive, stem-and-leaf display
Construct and interpret graphs to describe relationships between variables: scatter plot, cross table
Describe appropriate and inappropriate ways to display data graphically

Chap 2-5 Types of Data
Data are either categorical or numerical; numerical data are either discrete or continuous.
Categorical (defined categories or groups). Examples: marital status; "Are you registered to vote?"; eye color.
Discrete (counted items). Examples: number of children; defects per hour.
Continuous (measured characteristics). Examples: weight; voltage.

Chap 2-6 Measurement Levels
Qualitative data:
Nominal data: categories (no ordering or direction)
Ordinal data: ordered categories (rankings, order, or scaling)
Quantitative data:
Interval data: differences between measurements but no true zero
Ratio data: differences between measurements, true zero exists

Chap 2-7 Raw data
Suppose a researcher wishes to study banks' operations. The researcher first has to collect the reporting data. Data collected in their original form are called raw data (Example 2.1 from Batueva).

Chap 2-8 Table 5

Chap 2-9 Raw data
Compiling (summarizing) the collected data is the procedure of generalizing the data on a population in order to reveal its characteristic features and regularities. Summaries are classified as simple and complex. A simple summary only computes totals. A complex summary is a procedure that includes grouping the data, computing totals for each class and for the whole population, and presenting the results in tables and graphs.

Chap 2-10 Graphical Presentation of Data
Data in raw form are usually not easy to use for decision making.
Some type of organization is needed: a table or a graph.
The type of graph to use depends on the variable being summarized.

Chap 2-11 Graphical Presentation of Data (continued)
Techniques reviewed in this chapter:
Categorical variables: frequency distribution, bar chart, pie chart, Pareto diagram
Numerical variables: line chart, frequency distribution, histogram and ogive, stem-and-leaf display, scatter plot

Chap 2-12 Categorical FD
Categorical frequency distributions are used for data that can be placed in specific categories (typological grouping), such as classes of enterprises by type of ownership. Another example is the grouping of people by blood type.

Chap 2-13 Tables and Graphs for Categorical Variables
Categorical data:
Tabulating data: frequency distribution table
Graphing data: bar chart, pie chart, Pareto diagram

Chap 2-14 Example of Categorical FD
Example: 25 blood donors were given a blood test to determine their blood type. The data set is as follows:
A B B A AB O O B AB B B B O A O A O O O AB AB A O B A

Chap 2-15 Example 2.2
Blood type O corresponds to the first group, A to the second, B to the third, and AB to the fourth blood group. Let us construct a frequency distribution for the data. First, I recommend arranging the 25 results:
O O O O O O O O A A A A A A B B B B B B B AB AB AB AB
The next step is to make a table like Table 5.

Chap 2-16 Table 6. The distribution of donors
Blood type | Tally | Number of donors (frequency f_i) | Share of donors, % (relative frequency w_i)
O | //////// | 8 | 32
A | ////// | 6 | 24
B | /////// | 7 | 28
AB | //// | 4 | 16
Total | | 25 | 100

Chap 2-17
Blood type O is the most frequent (8 of the 25 donors). In addition, we can calculate percentages, which are called relative frequencies. The relative frequency is calculated using the following formula:

Chap 2-18
$$w_i = \frac{f_i}{\sum_i f_i} \cdot 100\%$$
For example, for blood type O: w = 8 / 25 · 100% = 32%.
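The tally and the relative frequencies of the blood-type example can be reproduced with a short script. This is only a sketch: the function name `frequency_table` is an illustrative choice, not something taken from the lecture, and the Cyrillic look-alike letters of the slide are written here in Latin.

```python
from collections import Counter

# Blood types of the 25 donors from the example (Latin letters throughout).
blood_types = "A B B A AB O O B AB B B B O A O A O O O AB AB A O B A".split()


def frequency_table(observations):
    """Return {category: (frequency f_i, relative frequency w_i in %)}."""
    counts = Counter(observations)
    total = sum(counts.values())
    return {cat: (f, 100 * f / total) for cat, f in counts.items()}


table = frequency_table(blood_types)
for cat in ("O", "A", "B", "AB"):
    f, w = table[cat]
    print(f"{cat:>2}  f = {f:2d}  w = {w:.0f}%")
```

The printed rows should agree with Table 6: frequencies 8, 6, 7, 4 and relative frequencies 32%, 24%, 28%, 16%.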

Chap 2-19 Pie graph
Percentages are added to the table because certain types of graphical presentation use them. We can use a pie graph (next slide) to present the shares of people with different blood types.

Chap 2-20 Pie graph

Chap 2-21 The Frequency Distribution Table
Summarize data by category (the variables are categorical).
Table 7. Example: Hospital Patients by Unit
Hospital Unit | Number of Patients
Cardiac Care | 1,052
Emergency | 2,245
Intensive Care | 340
Maternity | 552
Surgery | 4,630

Chap 2-22 Bar and Pie Charts
Bar charts and pie charts are often used for qualitative (categorical) data.
The height of a bar or the size of a pie slice shows the frequency or percentage for each category.

Chap 2-23 Bar Chart Example
Hospital Unit | Number of Patients
Cardiac Care | 1,052
Emergency | 2,245
Intensive Care | 340
Maternity | 552
Surgery | 4,630

Chap 2-24 Pie Chart Example
(Percentages are rounded to the nearest percent)
Hospital Unit | Number of Patients | % of Total
Cardiac Care | 1,052 | 12
Emergency | 2,245 | 25
Intensive Care | 340 | 4
Maternity | 552 | 6
Surgery | 4,630 | 53

Chap 2-25 Pareto Diagram
Used to portray categorical data.
A bar chart in which categories are shown in descending order of frequency.
A cumulative polygon is often shown in the same graph.
Used to separate the "vital few" from the "trivial many".

Chap 2-26 Pareto Diagram Example
Example: 400 defective items are examined for cause of defect (Table 8):
Source of Manufacturing Error | Number of defects
Bad Weld | 34
Poor Alignment | 223
Missing Part | 25
Paint Flaw | 78
Electrical Short | 19
Cracked Case | 21
Total | 400

Chap 2-27 Pareto Diagram Example (continued)
Step 1: Sort by defect cause, in descending order of frequency
Step 2: Determine % in each category
Table 8a
Source of Manufacturing Error | Number of defects | % of Total Defects
Poor Alignment | 223 | 55.75
Paint Flaw | 78 | 19.50
Bad Weld | 34 | 8.50
Missing Part | 25 | 6.25
Cracked Case | 21 | 5.25
Electrical Short | 19 | 4.75
Total | 400 | 100%

Chap 2-28 Pareto Diagram Example (continued)
Step 3: Determine cumulative %
Table 8b
Source of Manufacturing Error | Number of defects | % of Total Defects | Cumulative %
Poor Alignment | 223 | 55.75 | 55.75
Paint Flaw | 78 | 19.50 | 75.25
Bad Weld | 34 | 8.50 | 83.75
Missing Part | 25 | 6.25 | 90.00
Cracked Case | 21 | 5.25 | 95.25
Electrical Short | 19 | 4.75 | 100.00
Total | 400 | 100% |

Chap 2-29 Pareto Diagram Example (continued)
Step 4: Show results graphically: % of defects in each category (bar graph) and cumulative % (line graph)
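Steps 1 to 3 of the Pareto example (sort in descending order, compute the percentage of each category, accumulate the percentages) can be sketched in a few lines. The function name `pareto_table` is hypothetical; the defect counts are those given in the example.

```python
# Defect counts from the Pareto diagram example (400 defective items in total).
defects = {
    "Bad Weld": 34,
    "Poor Alignment": 223,
    "Missing Part": 25,
    "Paint Flaw": 78,
    "Electrical Short": 19,
    "Cracked Case": 21,
}


def pareto_table(counts):
    """Return rows (cause, count, % of total, cumulative %) in descending order."""
    total = sum(counts.values())
    rows = []
    cumulative = 0
    for cause, n in sorted(counts.items(), key=lambda kv: kv[1], reverse=True):
        cumulative += n
        rows.append((cause, n, 100 * n / total, 100 * cumulative / total))
    return rows


rows = pareto_table(defects)
for cause, n, pct, cum_pct in rows:
    print(f"{cause:<17} {n:4d} {pct:6.2f} {cum_pct:7.2f}")
```

Plotting the third column as bars and the fourth as a line would give the diagram of Step 4.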

Chap 2-30 Graphs for Time-Series Data
A line chart (time-series plot) is used to show the values of a variable over time.
Time is measured on the horizontal axis.
The variable of interest is measured on the vertical axis.

Chap 2-31 Line Chart Example

Chap 2-32 Graphs to Describe Numerical Variables
Numerical data: frequency distributions and cumulative distributions (histogram, ogive); stem-and-leaf display

Chap 2-33 Frequency Distributions
What is a frequency distribution? A frequency distribution is a list or a table containing class groupings (categories or ranges within which the data fall) and the corresponding frequencies with which data fall within each class or category.

Chap 2-34 Why Use Frequency Distributions?
A frequency distribution (FD) is a way to summarize data.
The distribution condenses the raw data into a more useful form and allows for a quick visual interpretation of the data.

Chap 2-35 Steps of making an FD
1. Determine or calculate the number of classes (groups)
2. Sort the data in ascending or descending order
3. Make intervals for numeric data
4. Count the number of units in each class (group)
5. Form the FD table
6. Draw a graph illustrating the FD

Chap 2-36 Determining the number of classes (groups)
For categorical data, the number of classes (groups) equals the number of categories. We'll have 2 classes for a gender distribution and 378 groups for places of birth if our respondents declared 378 different birthplaces.

Chap 2-37 Determining the number of classes (groups)
For numerical data, the number of classes (groups) may be determined by the researcher in accordance with the objectives of the investigation, his or her taste, and common sense. It is recommended to use at least 5 but no more than 15-20 groups.

Chap 2-38 Calculating the number of groups for numeric data
Obviously, the number of classes (groups) depends on the size of the population or sample being organized. It is possible to calculate this number using the following formula proposed by H. A. Sturges in 1926:
$$n = 1 + 3.322 \lg N$$
where n is the number of groups (classes) and N is the size of the population (sample size).

Chap 2-39 Table 9. The number of groups according to the Sturges formula (columns: N, n)
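The Sturges formula is easy to tabulate. The sketch below assumes a few illustrative population sizes (they are not the values of Table 9) and rounds the result to the nearest whole number of groups; `sturges_classes` is a hypothetical helper name.

```python
import math


def sturges_classes(n_obs):
    """Number of groups by the Sturges formula: n = 1 + 3.322 * lg N."""
    return 1 + 3.322 * math.log10(n_obs)


# Illustrative sizes only; chosen for the sketch, not taken from Table 9.
for size in (20, 50, 100, 500, 1000):
    n = sturges_classes(size)
    print(f"N = {size:5d}  n = {n:.2f}  rounded: {round(n)}")
```

For N = 20 the formula gives about 5.3, so 5 groups; the number of groups grows only slowly as N increases.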

Chap 2-40 Intervals
Each interval has a width and upper and lower limits (endpoints, boundaries), or at least one of them. Intervals may be equal or unequal in width, open (the first with only an upper endpoint and the last with only a lower endpoint) or closed (with both endpoints). Unequal intervals are preferable when the data vary widely. They can be progressively ascending, progressively descending, arbitrary, or specialized.

Chap 2-41 Equal Intervals
Let's group the banks by the size of owner's equity using the data from Table 5. First of all, we define the number of classes using the Sturges formula:
$$n = 1 + 3.322 \lg N = 1 + 3.322 \lg 20 \approx 5.3 \approx 5 \text{ classes}$$

Chap 2-42 Table 5

Chap 2-43 Equal Intervals
The number of classes can also be defined approximately from the following rule of thumb: the number of classes should be about one quarter of the population size (for N = 20, about 5 classes).

Chap 2-44 Equal Intervals
Next, we define the width (w) of the classes using the following equation:
$$w = \frac{R}{n} \quad \text{or} \quad w = \frac{x_{\max} - x_{\min}}{n}$$
where R is the range and n is the number of classes.

Chap 2-45 Equal Intervals
Some additional rules:
Each class grouping has the same width
Use at least 5 but no more than 15-20 intervals
Intervals never overlap
Round up the interval width to get desirable interval endpoints
The results of the grouping are presented in Table 10:

Chap 2-46 Table 10. Distribution of banks
Columns: size of owner's equity, USD mln; number of banks, units (f_j); items; size of owner's equity per one bank. Total: 20 banks.

Chap 2-47 Equal Intervals
The first class has a lower boundary of 12 because this is the minimum value in the population. Adding the class width to the lower boundary gives the upper boundary of the first class:
$$x_{\text{upper}} = x_{\min} + w = 12 + w$$

Chap 2-48 Equal Intervals
Since the variable "owned capital" is continuous, the lower boundary of the next class equals the upper boundary of the previous class. If a variable is discrete, the lower boundary of the next class is obtained by adding 1 to the upper boundary of the previous class. In the next step, the average owned capital per bank in each class can be calculated.

Chap 2-49 Equal Intervals
This procedure is rather precise. The book we use as another reference, Statistics for Business and Economics, gives an additional approach in which the procedure for calculating the interval parameters is less strict.

Chap 2-50 Frequency Distribution Example
Example: A manufacturer of insulation randomly selects 20 winter days and records the daily high temperature.
Table 11
24, 35, 17, 21, 24, 37, 26, 46, 58, 30, 32, 13, 12, 38, 41, 43, 44, 27, 53, 27

Chap 2-51 Frequency Distribution Example (continued)
Sort raw data in ascending order:
Table 11a
12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58
Find range: 58 - 12 = 46
Select number of classes according to Sturges, with N = 20 observations: n = 1 + 3.322 lg N = 1 + 3.322 lg 20 ≈ 5.3, i.e. 5 classes.
The authors of this textbook also take 5 groups in this example. Let's follow their explanations.

Chap 2-52 Frequency Distribution Example (continued)
Sort raw data in ascending order:
Table 11a
12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58
Find range: 58 - 12 = 46
Select number of classes: 5 (usually between 5 and 15)
Compute interval width: 10 (46/5, then round up)
Determine interval boundaries: 10 but less than 20, 20 but less than 30, ..., 50 but less than 60
Count observations and assign to classes

Chap 2-53 Frequency Distribution Example (continued)
Data in ordered array: 12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58
Table 12
Interval | Frequency | Relative Frequency | Percentage
10 but less than 20 | 3 | .15 | 15
20 but less than 30 | 6 | .30 | 30
30 but less than 40 | 5 | .25 | 25
40 but less than 50 | 4 | .20 | 20
50 but less than 60 | 2 | .10 | 10
Total | 20 | 1.00 | 100
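The counting step of the temperature example can be checked with a small script. It is a sketch that assumes equal-width classes of the form "lower but less than upper"; the helper name `grouped_frequencies` is illustrative.

```python
# Daily high temperatures from Table 11 (20 winter days).
temperatures = [24, 35, 17, 21, 24, 37, 26, 46, 58, 30,
                32, 13, 12, 38, 41, 43, 44, 27, 53, 27]


def grouped_frequencies(data, lower, width, n_classes):
    """Count observations in classes [lower + i*width, lower + (i+1)*width)."""
    counts = [0] * n_classes
    for x in data:
        index = int((x - lower) // width)
        if not 0 <= index < n_classes:
            raise ValueError(f"{x} falls outside the class intervals")
        counts[index] += 1
    return [(lower + i * width, lower + (i + 1) * width, c)
            for i, c in enumerate(counts)]


classes = grouped_frequencies(temperatures, lower=10, width=10, n_classes=5)
total = len(temperatures)
for low, high, count in classes:
    print(f"{low} but less than {high}: f = {count}, {100 * count / total:.0f}%")
```

The frequencies come out as 3, 6, 5, 4 and 2, which sum to the 20 observations.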

Chap 2-54 Histogram
A graph of the data in a frequency distribution is called a histogram.
The interval endpoints are shown on the horizontal axis.
The vertical axis is either frequency, relative frequency, or percentage.
Bars of the appropriate heights are used to represent the number of observations within each class.

Chap 2-55 Histogram Example
(No gaps between bars; horizontal axis: Temperature in Degrees; vertical axis: Frequency)
Interval | Frequency
10 but less than 20 | 3
20 but less than 30 | 6
30 but less than 40 | 5
40 but less than 50 | 4
50 but less than 60 | 2

Chap 2-56 Histograms in Excel
1. Select Tools/Data Analysis

Chap 2-57 Histograms in Excel (continued)
2. Choose Histogram
3. Input data range and bin range (the bin range is a cell range containing the upper interval endpoints for each class grouping). Select Chart Output and click OK

Chap 2-58 Unequal intervals
The second approach. If the units of the population are distributed non-uniformly, unequal class widths should be used. Unequal class widths can be classified as monotonically increasing or monotonically decreasing intervals.

Chap 2-59 Unequal intervals
If the number of units of the population decreases as the variable increases, monotonically increasing intervals should be used. If the number of units of the population increases as the variable increases, monotonically decreasing intervals should be used. For example:

Chap 2-60 Table 13. Distribution of companies
Columns: class; the size of fixed capital. The first class is "up to ..." and the last class is "more than 1550".

Chap 2-61 Unequal intervals
The width of the next class could be twice the width of the previous class. In this case the class widths form a geometric series. The width of classes that change by a geometric series is calculated using the following equation:
$$w_{i+1} = w_i \cdot q$$
where q is a constant that is greater than 1 for monotonically increasing intervals and less than 1 for monotonically decreasing intervals.
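A minimal sketch of class boundaries whose widths grow as a geometric series. The starting point 0, the first width 50 and q = 2 are assumptions chosen for illustration (they happen to end at 1550, the last boundary visible in Table 13); `geometric_boundaries` is a hypothetical helper name.

```python
def geometric_boundaries(start, first_width, q, n_classes):
    """Class boundaries where each width is q times the previous width."""
    boundaries = [start]
    width = first_width
    for _ in range(n_classes):
        boundaries.append(boundaries[-1] + width)
        width *= q
    return boundaries


# Assumed parameters: widths 50, 100, 200, 400, 800 (q = 2).
bounds = geometric_boundaries(start=0, first_width=50, q=2, n_classes=5)
for low, high in zip(bounds, bounds[1:]):
    print(f"{low} - {high} (width {high - low})")
```

With q greater than 1 the intervals widen toward large values; with q less than 1 they would narrow instead.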

Chap 2-62 Arbitrary intervals
The third approach. For macroeconomic research, grouping with arbitrary intervals should be used. In this case, the class width can be defined using the coefficient of variation. Example: below are data on import values for some European countries (billion USD):

Chap 2-63 Table 14. Value of imports between some EU countries, USD bln
Source: hypothetical data

Chap 2-64 Arbitrary intervals
First of all, these data must be put in ascending order.

Chap 2-65 Arbitrary intervals
After that, it is necessary to measure the spread (dispersion) between the first two values, using the coefficient of variation:
$$V = \frac{\sigma}{\bar{x}} \cdot 100\%$$
where σ is the standard deviation and x̄ is the mean of the values considered.

Chap 2-66 Arbitrary intervals
If the coefficient of variation is less than 33%, we combine the first three values and repeat the calculation. This procedure is continued until the coefficient of variation exceeds 33%. In the example, the coefficient of variation exceeded 33% after the value 3.9 was added. It means that the first class has a lower boundary of 1.5 and an upper boundary of 3.5 (the value before 3.9).

Chap 2-67 Arbitrary intervals
The lower boundary of the second class equals 3.5, but the calculation of the spread starts with 3.9. After that, the previous procedure is repeated. After calculating the coefficients of variation, the following grouping was obtained (Table 15).

Chap 2-68 Table 15. Distribution of the EU countries
Columns: import values; number of countries. Total: 20
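The arbitrary-interval procedure (keep adding sorted values to a class until the coefficient of variation exceeds 33%, then start a new class from the value that caused the excess) can be sketched as follows. Several things here are assumptions: the population standard deviation is used, the data are strictly positive, and the sample values are made up for illustration because the import values of Table 14 did not survive. The function names are hypothetical.

```python
from statistics import fmean, pstdev


def coefficient_of_variation(values):
    """Coefficient of variation in percent (population std. deviation / mean)."""
    return 100 * pstdev(values) / fmean(values)


def group_by_variation(values, threshold=33.0):
    """Split sorted values into classes whose coefficient of variation stays
    at or below the threshold; a value that pushes it above starts a new class."""
    data = sorted(values)
    groups, current = [], [data[0]]
    for x in data[1:]:
        candidate = current + [x]
        if coefficient_of_variation(candidate) > threshold:
            groups.append(current)   # close the class before x
            current = [x]            # the next class starts counting from x
        else:
            current = candidate
    groups.append(current)
    return groups


# Made-up illustrative data, not the values of Table 14.
sample = [10, 100, 10, 100, 10]
groups = group_by_variation(sample)
print(groups)
```

Each resulting class runs from its smallest to its largest member; following the lecture, the lower boundary of the next class would be set equal to the upper boundary of the previous one.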

Chap 2-69 Arbitrary intervals
A grouped frequency distribution can be transformed into an ungrouped frequency distribution. To do this, it is necessary to define the midpoint of each class. Let's return to Table 10.

Chap 2-70 Table 10. Distribution of banks
Columns: size of owner's equity, USD mln; number of banks, units (f_i); items; size of owner's equity per one bank. Total: 20 banks.

Chap 2-71 Arbitrary intervals
The midpoint of each class should be defined as the half-sum of its lower and upper boundaries:
$$x'_i = \frac{x_i^{\text{lower}} + x_i^{\text{upper}}}{2}$$
The modified Table 10 is given below:

Chap 2-72 Table 16. Distribution of banks
Columns: size of owned capital, USD mln; number of banks, units; size of owned capital per one bank; midpoint; cumulative frequencies. Total: 20 banks.

Chap 2-73 Questions for Grouping Data into Intervals
1. How wide should each interval be? (How many classes should be used?)
2. How should the endpoints of the intervals be determined?
These questions are often answered by trial and error, subject to user judgment.
The goal is to create a distribution that is neither too "jagged" nor too "blocky".
The goal is to appropriately show the pattern of variation in the data.

Chap 2-74 How Many Class Intervals?
Many (narrow class intervals): may yield a very jagged distribution with gaps from empty classes; can give a poor indication of how frequency varies across classes.
Few (wide class intervals): may compress variation too much and yield a blocky distribution; can obscure important patterns of variation.
(X axis labels are upper class endpoints)

Chap 2-75 Graphs
After the data are organized in the form of a frequency distribution, we can display them with a graph. The three most common graphical forms are:
1. The histogram (bar graph), used for a grouped frequency distribution.
2. The frequency polygon, used for an ungrouped frequency distribution.
3. The ogive.

Chap 2-76 Histogram
The histogram is a graph that displays the data by using vertical bars of various heights to represent the frequencies.

Chap 2-77 Histogram

Chap 2-78 Frequency Polygon
The frequency polygon is a graph that displays the data by using lines that connect the midpoints of each class.

Chap 2-79 Frequency Polygon

Chap 2-80 Ogive
The histogram and the frequency polygon are two different ways to represent the same data set. The choice of which one to use is left to the discretion of the researcher. Still another type of graph that can be used is the cumulative frequency graph, or ogive. The ogive is a graph that represents the cumulative frequencies for the classes in the frequency distribution. For this, the cumulative frequencies S_i should be calculated (Table 16).

Chap 2-81 Ogive

Chap 2-82 Ogive
Cumulative frequency graphs are used to show visually how many values lie below a certain upper class boundary. In the previous picture, the ogive shows how many banks have owned capital up to a chosen value.

Chap 2-83 The Cumulative Frequency Distribution
Data in ordered array: 12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58
Class | Frequency | Percentage | Cumulative Frequency | Cumulative Percentage
10 but less than 20 | 3 | 15 | 3 | 15
20 but less than 30 | 6 | 30 | 9 | 45
30 but less than 40 | 5 | 25 | 14 | 70
40 but less than 50 | 4 | 20 | 18 | 90
50 but less than 60 | 2 | 10 | 20 | 100
Total | 20 | 100 | |

Chap 2-84 The Ogive: Graphing Cumulative Frequencies
(Horizontal axis: interval endpoints; vertical axis: cumulative percentage)
Interval | Upper interval endpoint | Cumulative Percentage
Less than 10 | 10 | 0
10 but less than 20 | 20 | 15
20 but less than 30 | 30 | 45
30 but less than 40 | 40 | 70
40 but less than 50 | 50 | 90
50 but less than 60 | 60 | 100
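The cumulative columns and the points of the ogive follow directly from the class frequencies of the temperature example. A short sketch, with variable names chosen for illustration:

```python
from itertools import accumulate

# Class frequencies for the intervals 10-20, 20-30, 30-40, 40-50, 50-60.
frequencies = [3, 6, 5, 4, 2]
upper_endpoints = [20, 30, 40, 50, 60]

cumulative = list(accumulate(frequencies))
total = cumulative[-1]
cumulative_pct = [100 * c / total for c in cumulative]

# The ogive starts at 0% at the lower endpoint of the first class.
ogive_points = [(10, 0.0)] + list(zip(upper_endpoints, cumulative_pct))
for endpoint, pct in ogive_points:
    print(f"less than {endpoint}: {pct:.0f}%")
```

Connecting `ogive_points` with straight lines gives the ogive: the cumulative percentage is plotted against the upper endpoint of each interval.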

Chap 2-85 Distribution Shape
The shape of the distribution is said to be symmetric if the observations are balanced, or evenly distributed, about the center.

Chap 2-86 Distribution Shape (continued)
The shape of the distribution is said to be skewed if the observations are not symmetrically distributed around the center.
A positively skewed distribution (skewed to the right) has a tail that extends to the right, in the direction of positive values.
A negatively skewed distribution (skewed to the left) has a tail that extends to the left, in the direction of negative values.

Chap 2-87 Stem-and-Leaf Diagram
A simple way to see distribution details in a data set.
METHOD: Separate the sorted data series into leading digits (the stem) and trailing digits (the leaves).

Chap 2-88 Example
Data in ordered array: 21, 24, 24, 26, 27, 27, 30, 32, 38, 41
Here, use the 10s digit for the stem unit:
Stem | Leaf
21 is shown as 2 | 1
38 is shown as 3 | 8

Chap 2-89 Example (continued)
Data in ordered array: 21, 24, 24, 26, 27, 27, 30, 32, 38, 41
Completed stem-and-leaf diagram:
Stem | Leaves
2 | 1 4 4 6 7 7
3 | 0 2 8
4 | 1
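The stem-and-leaf method for two-digit data (10s digit as the stem, units digit as the leaf) can be sketched like this; the function name `stem_and_leaf` is illustrative.

```python
from collections import defaultdict


def stem_and_leaf(data, stem_unit=10):
    """Map each stem (leading digits) to the sorted list of its leaves."""
    display = defaultdict(list)
    for x in sorted(data):
        display[x // stem_unit].append(x % stem_unit)
    return dict(display)


values = [21, 24, 24, 26, 27, 27, 30, 32, 38, 41]
diagram = stem_and_leaf(values)
for stem, leaves in diagram.items():
    print(stem, "|", " ".join(str(leaf) for leaf in leaves))
```

This sketch keeps the leaf as the exact remainder, so it suits the 10s-digit case; using the 100s digit as the stem additionally requires rounding off the leaves.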

Chap 2-90 Using other stem units
Using the 100s digit as the stem: round off the 10s digit to form the leaves.
Stem | Leaf
613 would become 6 | 1
776 would become 7 | 8
1224 becomes 12 | 2

Chap 2-91 Using other stem units (continued)
Using the 100s digit as the stem:
Data: 613, 632, 658, 717, 722, 750, 776, 827, 841, 859, 863, 891, 894, 906, 928, 933, 955, 982, 1034, 1047, 1056, 1140, 1169, 1224
The completed stem-and-leaf display:
Stem | Leaves
6 | 1 3 6
7 | 2 2 5 8
8 | 3 4 6 6 9 9
9 | 1 3 3 6 8
10 | 3 5 6
11 | 4 7
12 | 2

Chap 2-92 Relationships Between Variables
Graphs illustrated so far have involved only a single variable.
When two variables exist, other techniques are used:
Categorical (qualitative) variables: cross tables
Numerical (quantitative) variables: scatter plots

Chap 2-93 Scatter Diagrams
Scatter diagrams are used for paired observations taken from two numerical variables.
The scatter diagram: one variable is measured on the vertical axis and the other variable is measured on the horizontal axis.

Chap 2-94 Scatter Diagram Example
(Axes: volume per day, cost per day)

Chap 2-95 Scatter Diagrams in Excel
1. Select the chart wizard
2. Select the XY (Scatter) option, then click Next
3. When prompted, enter the data range, desired legend, and desired destination to complete the scatter diagram

Chap 2-96 Cross Tables
Cross tables (or contingency tables) list the number of observations for every combination of values for two categorical or ordinal variables.
If there are r categories for the first variable (rows) and c categories for the second variable (columns), the table is called an r x c cross table.
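A cross table is just a count of observations for every (row category, column category) pair. The sketch below uses made-up observations purely for illustration (they are not data from the lecture), and `cross_table` is a hypothetical helper name.

```python
from collections import Counter

# Made-up paired observations: (blood type, gender). Illustration only.
observations = [
    ("O", "F"), ("A", "M"), ("O", "M"), ("B", "F"),
    ("O", "F"), ("A", "F"), ("B", "M"), ("O", "M"),
]


def cross_table(pairs):
    """Return (row labels, column labels, r x c matrix of counts)."""
    counts = Counter(pairs)
    rows = sorted({r for r, _ in pairs})
    cols = sorted({c for _, c in pairs})
    matrix = [[counts[(r, c)] for c in cols] for r in rows]
    return rows, cols, matrix


rows, cols, matrix = cross_table(observations)
print("   ", *cols)
for label, line in zip(rows, matrix):
    print(f"{label:<3}", *line)
```

Here r = 3 and c = 2, so the result is a 3 x 2 cross table; row and column totals can be added by summing the matrix along each direction.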

Chap 2-97 Cross Table Example
4 x 3 cross table for investment choices by investor (values in $1000s)
Rows (investment category): Stocks, Bonds, CD, Savings, Total
Columns: Investor A, Investor B, Investor C, Total

Chap 2-98 Graphing Multivariate Categorical Data (continued)
Side-by-side bar charts

Chap 2-99 Side-by-Side Chart Example
Sales by quarter for three sales territories:

Chap 2-100 Data Presentation Errors
Goals for effective data presentation:
Present data to display essential information
Communicate complex ideas clearly and accurately
Avoid distortion that might convey the wrong message

Chap 2-101 Data Presentation Errors (continued)
Unequal histogram interval widths
Compressing or distorting the vertical axis
Providing no zero point on the vertical axis
Failing to provide a relative basis in comparing data between groups

Chap 2-102 Secondary Classification
Secondary classification is the regrouping of the results of a primary grouping. There are two ways of regrouping: 1) consolidation of class widths, when two class widths are added together; 2) dissection of class widths, when each class is split into

Chap 2-103 Home Task
Your Home Task for the next two weeks is:
1. Read about secondary classification in the textbooks: Batueva pp , my Russian presentation "Сводка и группировка. Часть II" (Summary and Grouping. Part II) at oknedis.narod.ru, my textbook "Теория статистики" (Theory of Statistics) p
Ex. 19 (30 points), 22 (10), 26 (15) from "Сборник задач по общей теории статистики" (Problem Book on the General Theory of Statistics)
Your maximum mark for this task is 65 points for the first student who sends the task to my , 64 for the second, etc. The deadline is the 15th of March, 9:00 a.m.

Chap 2-104 Chapter Summary
Reviewed types of data and measurement levels
Data in raw form are usually not easy to use for decision making; some type of organization is needed: a table or a graph
Techniques reviewed in this chapter: frequency distribution, bar chart, pie chart, Pareto diagram, line chart, histogram and ogive, stem-and-leaf display, scatter plot, cross tables and side-by-side bar charts

THE END