




Introduction to Biostatistics 

Statistics is simply a collection of tools that researchers employ to help answer research questions.


INTRODUCTION

Statistics plays a vital role in research.

Health information is very often explained in statistical terms.

Many decisions in the health sciences are made on the basis of statistical studies.

It enables you:

to read and evaluate reports and other literature

to undertake independent research investigations

to describe the data in meaningful terms
DEFINITIONS

Statistics: the study of how to collect, organize, analyze, and interpret data.

Data: the values recorded in an experiment or observation.

Population: refers to any collection of individual items or units that are the subject of investigation.

Sample: a small, representative subset of a population.

Observation: the record (for example, a measurement) provided by each unit in the sample.

Sampling: the process of selecting a sample from a population.

Variable: a characteristic of an item or individual that can take different values.

Raw Data: Data collected in original form.

Frequency: The number of times a certain value or class of values occurs.

Tabulation: can be defined as the logical and systematic arrangement of statistical data in rows and columns.

Frequency Distribution: The organization of raw data in table form with classes and frequencies.

Class Limits: Separate one class in a grouped frequency distribution from another. The limits could actually appear in the data and have gaps between the upper limit of one class and the lower limit of the next.

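Class Boundaries: Separate one class in a grouped frequency distribution from another. Unlike class limits, the boundaries lie halfway between the upper limit of one class and the lower limit of the next, so there are no gaps between classes and the boundaries do not appear in the data.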

Cumulative Frequency: The number of values less than the upper class boundary for the current class. This is a running total of the frequencies.

Histogram: A graph which displays the data by using vertical bars of various heights to represent frequencies.

Frequency Polygon: A line graph. The frequency is placed along the vertical axis and the class midpoints are placed along the horizontal axis. These points are connected with lines.

Pie Chart: Graphical depiction of data as slices of a pie. The frequency determines the size of the slice. The number of degrees in any slice is the relative frequency times 360 degrees.

Central Tendency: a typical or representative value for a dataset.
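
The definitions above for frequency distribution, class limits, class boundaries, cumulative frequency, pie chart and central tendency can be illustrated with a short Python sketch. The ages and the class limits below are invented for illustration only, and the function name frequency_table is hypothetical. The sketch assumes whole-number data, so each boundary sits 0.5 beyond the corresponding limit.

```python
import statistics

# Hypothetical raw data: ages (whole years) of 20 patients.
ages = [23, 35, 41, 28, 52, 37, 45, 31, 29, 48,
        33, 39, 26, 44, 57, 36, 30, 42, 38, 34]

# Class limits chosen for illustration; note the gap between 29 and 30, etc.
class_limits = [(20, 29), (30, 39), (40, 49), (50, 59)]


def frequency_table(data, limits):
    """Build a grouped frequency distribution as a list of row dicts."""
    rows = []
    running_total = 0
    n = len(data)
    for lower, upper in limits:
        freq = sum(lower <= x <= upper for x in data)
        running_total += freq  # cumulative frequency is a running total
        rows.append({
            "limits": (lower, upper),
            # Boundaries sit halfway across the gap between adjacent classes.
            "boundaries": (lower - 0.5, upper + 0.5),
            "midpoint": (lower + upper) / 2,
            "frequency": freq,
            "cumulative": running_total,
            # Pie slice angle = relative frequency times 360 degrees.
            "pie_degrees": freq * 360 / n,
        })
    return rows


table = frequency_table(ages, class_limits)
for row in table:
    print(row)

# Two common measures of central tendency for the same raw data.
mean_age = statistics.mean(ages)
median_age = statistics.median(ages)
print("mean:", mean_age, "median:", median_age)
```

The last cumulative frequency equals the number of observations, and the pie slice angles add up to 360 degrees, which gives two quick checks on any hand-built table.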
VARIABLES
 A variable is a characteristic of an item or individual that can take different values.
 Variables are of two types:
 Quantitative: a variable with a numeric value. E.g. age, weight.
 Qualitative: a variable with a category or group value. E.g. gender (M/F), religion (H/M/C), qualification (degree/PG).
 Quantitative variables are of two types:
 Discrete variables
 Continuous variables
 Variables can be:
 Independent
 Are not influenced by other variables.
 Are not influenced by the event, but could influence the event.
 Dependent
 The variable that is influenced by the others is often referred to as the dependent variable.
E.g. in an experimental study of a relaxation intervention for reducing hypertension (HTN), blood pressure is the dependent variable, and relaxation training, age, and gender are independent variables.
SAMPLING

In simple random sampling, each individual in the population has an equal chance of being included in the sample. Two methods are used in simple random sampling:

Random Numbers method

Lottery method

In stratified random sampling, the population is divided into groups (strata) on the basis of certain characteristics, and a random sample is then drawn from each stratum.

In cluster sampling, the whole population is divided into a number of relatively small cluster groups. Then some of the clusters are randomly selected.

Convenience sampling is a type of nonprobability sampling which involves the sample being drawn from that part of the population which is selected because it is readily available and convenient.

Purposive sampling is a type of nonprobability sampling in which the researcher selects participants who fulfil certain criteria. E.g. patients with schizophrenia who are treatment-naive.
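
The three probability sampling methods described above can be sketched in Python using the standard random module. Everything in this sketch is an assumption made for illustration: the population of 100 patient records, the "sex" field used as the stratum, the "ward" field used as the cluster, and the function names. The cluster function keeps every unit in each selected cluster (one-stage cluster sampling), which is only one possible reading of the description.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: 100 patient records with a stratum and a cluster label.
population = [
    {"id": i, "sex": "F" if i % 2 == 0 else "M", "ward": i % 5}
    for i in range(100)
]


def simple_random_sample(units, k):
    """Every unit has the same chance of being selected."""
    return rng.sample(units, k)


def stratified_sample(units, key, k_per_stratum):
    """Split units into strata by `key`, then sample randomly within each."""
    strata = {}
    for unit in units:
        strata.setdefault(unit[key], []).append(unit)
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, k_per_stratum))
    return sample


def cluster_sample(units, key, n_clusters):
    """Randomly pick whole clusters and keep every unit inside them."""
    clusters = sorted({unit[key] for unit in units})
    chosen = set(rng.sample(clusters, n_clusters))
    return [unit for unit in units if unit[key] in chosen]


srs = simple_random_sample(population, 10)
strat = stratified_sample(population, "sex", 5)
clust = cluster_sample(population, "ward", 2)
print(len(srs), len(strat), len(clust))
```

The random numbers method mentioned above corresponds to letting a random number generator choose the units, as rng.sample does here; the lottery method is the manual equivalent.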
SCALES OF MEASUREMENT

Four measurement scales are used: nominal, ordinal, interval and ratio.

Each level has its own rules and restrictions.
Nominal Scale of Measurement

On a nominal scale, categories of people, events, and other phenomena are simply named.

Examples: gender, age class, religion, type of disease, blood groups A, B, AB, and O.

The categories are exhaustive and mutually exclusive.

These categories are discrete and noncontinuous.

Permissible statistical operations are: frequency counts, percentage, proportion, mode, and the coefficient of contingency.
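
Most of the operations listed above can be sketched with the standard library alone. The blood-group values below are invented for illustration, and the coefficient of contingency is left out because it needs a two-way table of two nominal variables.

```python
from collections import Counter

# Hypothetical nominal data: blood groups of 10 donors.
blood_groups = ["A", "O", "B", "O", "AB", "A", "O", "B", "O", "A"]

counts = Counter(blood_groups)            # frequency of each category
n = len(blood_groups)
proportions = {g: c / n for g, c in counts.items()}
percentages = {g: 100 * c / n for g, c in counts.items()}
mode_group = counts.most_common(1)[0][0]  # the most frequent category

print(counts)
print(proportions)
print(percentages)
print("mode:", mode_group)
```

Note that nothing here sorts or averages the categories: on a nominal scale the labels have no order, so only counting-based summaries make sense.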
Ordinal Scale of Measurement

It is second in terms of its refinement as a means of classifying information.

It incorporates the functions of nominal scale.

The ordinal scale is used to arrange (or rank) individuals into a sequence ranging from the highest to lowest.

Ordinal implies rank-ordered from highest to lowest.

Examples: grades A+, A, B+, B, C+, C

1st, 2nd, 3rd, etc.
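
Ranking on an ordinal scale can be sketched as follows. The list of student grades is invented for illustration, and the names GRADE_ORDER and RANK are hypothetical. The sketch also reports the middle-ranked grade (the median), a summary that is commonly used for ordinal data although it is not listed above.

```python
# Grade labels from lowest to highest; only the order carries meaning.
GRADE_ORDER = ["C", "C+", "B", "B+", "A", "A+"]
RANK = {grade: position for position, grade in enumerate(GRADE_ORDER)}

# Hypothetical grades of seven students.
grades = ["B+", "A", "C", "A+", "B", "B+", "C+"]

# Arrange individuals in sequence from highest to lowest.
ranked = sorted(grades, key=RANK.__getitem__, reverse=True)

# With an odd number of observations the median is the middle-ranked value.
median_grade = ranked[len(ranked) // 2]

print(ranked)
print("median grade:", median_grade)
```

The numeric positions in RANK are used only for sorting. Subtracting or averaging them would assume equal distances between grades, which an ordinal scale does not guarantee.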
Interval Scale of Measurement

Interval scale refers to the third level of measurement in relation to complexity of statistical techniques used to analyze data.

It is quantitative in nature.

The individual units are equidistant from one point to the next.

Interval data do not have an absolute zero.
Ratio Scale of Measurement

There are equal distances between the increments.

This scale has an absolute zero.

Ratio variables exhibit the characteristics of ordinal and interval measurement.
[The mathematical properties of interval and ratio scales are very similar, so the statistical procedures are common for both the scales.] 
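
The practical effect of an absolute zero can be seen in a small sketch. Temperature is used here as an assumed illustration that is not taken from the notes above: degrees Celsius form an interval scale (zero is arbitrary), while kelvin form a ratio scale (zero is absolute). The helper name celsius_to_kelvin is hypothetical.

```python
def celsius_to_kelvin(celsius):
    """Convert a Celsius reading to kelvin."""
    return celsius + 273.15


a_c, b_c = 10.0, 20.0                      # two readings in degrees Celsius
a_k, b_k = celsius_to_kelvin(a_c), celsius_to_kelvin(b_c)

# Differences are meaningful on both scales and agree with each other.
diff_c = b_c - a_c
diff_k = b_k - a_k

# Ratios only make sense where zero is absolute.
ratio_c = b_c / a_c   # 2.0, but 20 degrees C is not "twice as hot" as 10 degrees C
ratio_k = b_k / a_k   # about 1.035, the physically meaningful ratio

print(diff_c, diff_k)
print(ratio_c, ratio_k)
```

Differences behave the same way on both scales, which is consistent with the note above that most statistical procedures are shared between interval and ratio data.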




