STUDENTS PLEASE READ: Statistics is concerned with exploring, summarising, and making inferences about the state of complex systems. As summarised in Table 1.1, the development of statistics in Europe was strongly motivated by the need to make sense of the large amounts of data collected by population surveys in the emerging nation states. At the same time, the mathematical foundations of statistics advanced significantly thanks to breakthroughs in probability theory inspired by games of chance (gambling). For more information about the history of statistics, refer to the books by Johnson and Kotz (1998) and Kotz and Johnson (1993).

History of statistics

The history of statistics can be said to start around 1749, although the interpretation of the word "statistics" has changed over time. In early times the meaning was restricted to information about states. It was later extended to include collections of information of all types, and later still to include the analysis and interpretation of such data. In modern terms, statistics means both sets of collected information, as in national accounts and temperature records, and analytical work that requires statistical inference.

Statistical activities are often associated with models expressed using probabilities, and probability theory is needed to put them on a firm theoretical basis (see the history of probability). A number of statistical concepts have had an important impact on a wide range of sciences. These include the design of experiments and approaches to statistical inference such as Bayesian inference, each of which can be considered to have its own sequence of development in the history of modern statistics.

By the 18th century, the term "statistics" designated the systematic collection of demographic and economic data by states. In the early 19th century, the meaning broadened to include the discipline concerned with the collection, summary, and analysis of data. Today statistics is widely employed in government, business, and all the sciences. Electronic computers have expedited statistical computation and have allowed statisticians to develop computer-intensive methods.

The term "mathematical statistics" designates the mathematical theories of probability and statistical inference, which are used in statistical practice. The relation between statistics and probability theory developed rather late, however. In the 19th century, statistics increasingly used probability theory, whose initial results were found in the 17th and 18th centuries, particularly in the analysis of games of chance (gambling). By 1800, astronomy used probability models and statistical theories, particularly the method of least squares. Early probability theory and statistics were systematised in the 19th century, and statistical reasoning and probability models were used by social scientists to advance the new sciences of experimental psychology and sociology, and by physical scientists in thermodynamics and statistical mechanics. The development of statistical reasoning was closely associated with the development of inductive logic and the scientific method.

Statistics can be regarded not as a field of mathematics but as an autonomous mathematical science, like computer science and operations research. Unlike mathematics, statistics had its origins in public administration, and it is used in demography and economics.
With its emphasis on learning from data and making the best predictions, statistics has considerable overlap with decision science and microeconomics. With its concern with data, statistics also overlaps with information science and computer science.

Origins in probability theory

Basic forms of statistics have been used since the beginning of civilization. Early empires often collated censuses of the population or recorded the trade in various commodities. The Roman Empire was one of the first states to extensively gather data on the size of the empire's population, geographical area and wealth.

The arithmetic mean, although a concept known to the Greeks, was not generalised to more than two values until the 16th century. The invention of the decimal system by Simon Stevin in 1585 seems likely to have facilitated these calculations. The method was first adopted in astronomy by Tycho Brahe, who was attempting to reduce the errors in his estimates of the locations of various celestial bodies. The idea of the median originated in Edward Wright's book on navigation (Certaine Errors in Navigation) in 1599, in a section concerning the determination of location with a compass. Wright felt that this value was the most likely to be the correct value in a series of observations.

The birth of statistics is often dated to 1662, when John Graunt, along with Sir William Petty, a 17th-century economist who used early statistical methods to analyse demographic data, developed early human statistical and census methods that provided a framework for modern demography. Graunt produced the first life table, giving probabilities of survival to each age. His book Natural and Political Observations Made upon the Bills of Mortality used analysis of the mortality rolls to make the first statistically based estimate of the population of London. He knew that there were around 13,000 funerals per year in London and that three people died per eleven families per year; at that rate the funerals imply roughly 48,000 families. He estimated from the parish records that the average family size was 8 and so calculated that the population of London was about 384,000 (48,000 × 8).

Although the original scope of statistics was limited to data useful for governance, the approach was extended to many fields of a scientific or commercial nature during the 19th century. The mathematical foundations of the subject drew heavily on the new probability theory, pioneered in the 17th century in the correspondence between Pierre de Fermat and Blaise Pascal. Christiaan Huygens (1657) gave the earliest known scientific treatment of the subject. Jakob Bernoulli's Ars Conjectandi (posthumous, 1713) and Abraham de Moivre's The Doctrine of Chances (1718) treated the subject as a branch of mathematics. In his book, Bernoulli introduced the idea of representing complete certainty as one and probability as a number between zero and one.

The formal study of the theory of errors may be traced back to Roger Cotes's Opera Miscellanea (posthumous, 1722), but a memoir prepared by Thomas Simpson in 1755 (printed 1756) first applied the theory to the discussion of errors of observation. The reprint (1757) of this memoir lays down the axioms that positive and negative errors are equally probable, and that there are certain assignable limits within which all errors may be supposed to fall; continuous errors are discussed and a probability curve is given. Simpson discussed several possible distributions of error: he first considered the uniform distribution, then the discrete symmetric triangular distribution, followed by the continuous symmetric triangular distribution. Tobias Mayer, in his study of the libration of the moon (Kosmographische Nachrichten, Nuremberg, 1750), invented the first formal method for estimating unknown quantities by generalising the averaging of observations under identical circumstances to the averaging of groups of similar equations. Ruđer Bošković, in his 1755 work on the shape of the earth (De Litteraria expeditione per pontificiam ditionem ad dimetiendos duos meridiani gradus a PP. Maire et Boscovich), proposed that the true value of a series of observations would be the one that minimises the sum of absolute errors; in modern terminology, this value is the median.
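Bošković's criterion can be checked numerically. The short Python sketch below is an illustration only, with made-up observation values (it is not his original geodetic procedure): it brute-forces the value that minimises the sum of absolute errors and confirms that it coincides with the median.

    import statistics

    # Hypothetical repeated measurements of the same quantity (illustrative values).
    observations = [2.1, 2.4, 2.2, 3.0, 2.3]

    def sum_abs_error(candidate, data):
        # Total absolute deviation of the data from a candidate "true value".
        return sum(abs(x - candidate) for x in data)

    # Brute-force search over a fine grid of candidate values between 2.0 and 3.0.
    candidates = [round(2.0 + i * 0.001, 3) for i in range(1001)]
    best = min(candidates, key=lambda c: sum_abs_error(c, observations))

    print(best)                              # 2.3
    print(statistics.median(observations))   # 2.3 -- the minimiser is the median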
The first example of what later became known as the normal curve was studied by Abraham de Moivre, who plotted this curve on November 12, 1733. De Moivre was studying the number of heads that occurred when a fair coin was tossed.

TWO TYPES OF STATISTICS: There are two major divisions of the field of statistics. Each of these segments is important and accomplishes different objectives. The names of these subfields are descriptive and inferential statistics. What are the differences between these areas of statistics?

Descriptive Statistics

Descriptive statistics is the type of statistics that probably springs to most people's minds when they hear the word "statistics." Here the goal is to describe: numerical measures are used to tell about features of a set of data. A number of items belong in this portion of statistics, such as the following (a small worked sketch appears after the list):

• The average, or measure of center, consisting of the mean, median, mode or midrange.
• The spread of a data set, which can be measured with the range or standard deviation.
• Overall descriptions of data such as the five-number summary.
• Other measurements such as skewness and kurtosis.
• The exploration of relationships and correlation between paired data.
• The presentation of statistical results in graphical form.
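As a concrete illustration, the following minimal Python sketch computes several of these descriptive measures for a small made-up data set using only the standard library (skewness and kurtosis are not in the standard library; scipy.stats provides them).

    import statistics

    data = [2, 4, 4, 4, 5, 5, 7, 9]           # a small made-up data set

    # Measures of center
    mean = statistics.mean(data)              # 5.0
    median = statistics.median(data)          # 4.5
    mode = statistics.mode(data)              # 4
    midrange = (min(data) + max(data)) / 2    # 5.5

    # Measures of spread
    data_range = max(data) - min(data)        # 7
    stdev = statistics.pstdev(data)           # 2.0 (population standard deviation)

    # Five-number summary: minimum, lower quartile, median, upper quartile, maximum
    q1, q2, q3 = statistics.quantiles(data, n=4)
    five_number_summary = (min(data), q1, q2, q3, max(data))

    # Correlation between paired data (Pearson's r, Python 3.10+)
    x = [1, 2, 3, 4, 5]
    y = [2, 4, 5, 4, 5]
    r = statistics.correlation(x, y)

    print(mean, median, mode, midrange, data_range, stdev)
    print(five_number_summary)
    print(round(r, 3))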
Inferential Statistics

For the area of inferential statistics we begin by differentiating between two groups. The population is the entire collection of individuals that we are interested in studying. It is typically impossible or infeasible to examine each member of the population individually, so we choose a representative subset of the population, called a sample. Inferential statistics studies a statistical sample and, from this analysis, is able to say something about the population from which the sample came. There are two major divisions of inferential statistics:

• A confidence interval gives a range of values for an unknown parameter of the population by measuring a statistical sample. This is expressed in terms of an interval and the degree of confidence that the parameter lies within the interval.
• Tests of significance, or hypothesis tests, test a claim about the population by analyzing a statistical sample. By design there is some uncertainty in this process, which can be expressed in terms of a level of significance.

Difference Between These Areas

As seen above, descriptive statistics is concerned with telling about certain features of a data set. Although this is helpful in learning things such as the spread and center of the data we are studying, nothing in the area of descriptive statistics can be used to make any sort of generalization. In descriptive statistics, measurements such as the mean and standard deviation are stated as exact numbers. Though we may use descriptive statistics as much as we like in examining a statistical sample, this branch of statistics does not allow us to say anything about the population.

Inferential statistics differs from descriptive statistics in many ways. Even though there are similar calculations, such as those for the mean and standard deviation, the focus is different: inferential statistics starts with a sample and then generalizes to a population. This information about a population is not stated as a single exact number. Instead, we express these parameters as a range of plausible values, along with a degree of confidence. It is important to know the difference between descriptive and inferential statistics; this knowledge is helpful when we need to apply statistical methods to a real-world situation.

SAMPLE AND POPULATION: The major use of inferential statistics is to use information from a sample to infer something about a population. A population is a collection of data whose properties are analyzed; it is the complete collection to be studied and contains all subjects of interest. A sample is a part of the population of interest, a sub-collection selected from the population. A parameter is a numerical measurement that describes a characteristic of a population, while a statistic is a numerical measurement that describes a characteristic of a sample. In general, we use a statistic to infer something about a parameter.
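A minimal sketch of this idea in Python, using made-up sample values: we treat a small sample as coming from a much larger population, compute the sample mean (a statistic), and form an approximate 95% confidence interval for the unknown population mean (a parameter). The critical value 1.96 assumes an approximately normal sampling distribution; with a sample this small, a t critical value (about 2.06) would give a slightly wider interval.

    import math
    import statistics

    # A hypothetical sample of 25 measurements drawn from a much larger population.
    sample = [4.8, 5.1, 5.3, 4.9, 5.0, 5.2, 4.7, 5.4, 5.1, 4.9,
              5.0, 5.3, 4.8, 5.2, 5.1, 4.6, 5.5, 5.0, 4.9, 5.2,
              5.1, 4.8, 5.3, 5.0, 4.9]

    n = len(sample)
    sample_mean = statistics.mean(sample)     # a statistic: describes the sample
    sample_sd = statistics.stdev(sample)      # sample standard deviation

    # Approximate 95% confidence interval for the population mean (a parameter).
    margin = 1.96 * sample_sd / math.sqrt(n)
    lower, upper = sample_mean - margin, sample_mean + margin

    print(f"sample mean (statistic): {sample_mean:.3f}")
    print(f"approximate 95% CI for the population mean (parameter): ({lower:.3f}, {upper:.3f})")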
