R Programming

 

STATISTICAL METHODS LAB (R PROGRAMMING)

 

Syllabus

 

1. Familiarization with the R environment and RStudio; installing and using packages

2. Practice basic R input/output commands and create simple R programs using variables and mathematical operations

3. Learn control statements in R (if, switch, for, while, repeat, break, next)

4. Write R programs using functions (functions, recursive functions)

5. Learn to use data structures in R (strings, vectors, lists, matrices, arrays, data frames, factors)

6. Plotting in R (line graphs, scatter plots, bar plots, pie charts, histograms, box plots, strip charts)

7. Data manipulation using R (R data sets, basic summary statistics, reading/writing CSV and Excel files)

8. Measures of variability and correlation/covariance in R (range, variance, standard deviation, covariance/correlation)

9. Plotting of probability distributions using R functions (normal, binomial, Poisson)

10. Hypothesis testing using R (t-test, chi-square test, Wilcoxon signed-rank test)

11. Regression in R (linear, multiple, logistic)

12. Time series analysis in R (ARIMA)

 

Lab Cycle

Lab 1: Introduction to R Programming (Date:     )

·       Introduction to R environment and RStudio.

·       Basics of R syntax: variables, data types, arithmetic operations.

·       Writing and executing simple R scripts.

·       Introduction to R packages and libraries.

Lab 2: Basic programs in R (Date:     )

·    Write an R script to understand the basic data types and variable initialization.

·    Read two numbers and do the arithmetic operations.
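
As a starting point for the two Lab 2 tasks, here is a minimal sketch. The variable names and the values 17 and 5 are assumptions chosen for illustration; they are hard-coded so the script runs without user input.

```r
# Basic data types and how R reports them
num  <- 3.14        # numeric (double)
int  <- 42L         # integer (note the L suffix)
txt  <- "R lab"     # character
flag <- TRUE        # logical
cplx <- 2 + 3i      # complex
print(c(class(num), class(int), class(txt), class(flag), class(cplx)))

# Arithmetic on two numbers (illustrative values). In an interactive
# session they could instead be read with
# as.numeric(readline("Enter a number: ")).
a <- 17
b <- 5
results <- c(sum = a + b, difference = a - b, product = a * b,
             quotient = a / b, remainder = a %% b,
             int_quotient = a %/% b, power = a ^ b)
print(results)
```

Note that `%%` gives the remainder and `%/%` the integer quotient, which are easy to confuse with `/`.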

 

Lab 3: Control Structures (if-else statements, for, while) (Date:     )

·    Introduction to control structures: if-else statements, loops (for, while).

·    Check whether a number is positive, negative or zero (if-else)

·    Read a month number and Print the month name (switch)

·    Find the factors of a number (for)

·    Find the sum of digits of a number (while, repeat)
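
The four Lab 3 exercises could be sketched as small functions like the ones below. The function names are hypothetical, and the inputs are assumed to be whole numbers (non-negative for the digit sums, at least 1 for the factors).

```r
# if-else: classify the sign of a number
sign_of <- function(x) {
  if (x > 0) "positive" else if (x < 0) "negative" else "zero"
}

# switch: with a numeric first argument, switch() returns the m-th alternative
month_of <- function(m) {
  switch(m, "January", "February", "March", "April", "May", "June",
         "July", "August", "September", "October", "November", "December")
}

# for: collect every divisor of n
factors_of <- function(n) {
  found <- c()
  for (i in 1:n) {
    if (n %% i == 0) found <- c(found, i)
  }
  found
}

# while: add up the decimal digits of n
digit_sum <- function(n) {
  total <- 0
  while (n > 0) {
    total <- total + n %% 10   # take the last digit
    n <- n %/% 10              # drop the last digit
  }
  total
}

# repeat: same computation, with an explicit break as the exit condition
digit_sum_repeat <- function(n) {
  total <- 0
  repeat {
    if (n == 0) break
    total <- total + n %% 10
    n <- n %/% 10
  }
  total
}

print(sign_of(-3))
print(month_of(3))
print(factors_of(12))
print(digit_sum(1234))
```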

 

Lab 4: Functions and nested loops (Date:     )

·    Print all prime numbers less than 1000. (nested loops)

·    Write an R function to find the area and perimeter of a rectangle. (function)

·    Write a recursive factorial function and compute nCr. (Recursive function)
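
One possible shape for the Lab 4 programs is sketched below. The function names are hypothetical, `primes_below` assumes a limit greater than 2, and `ncr` assumes 0 <= r <= n.

```r
# Nested loops: try each candidate n against divisors d up to sqrt(n)
primes_below <- function(limit) {
  primes <- c()
  for (n in 2:(limit - 1)) {
    is_prime <- TRUE
    d <- 2
    while (d * d <= n) {
      if (n %% d == 0) {
        is_prime <- FALSE
        break                  # leaves the inner while loop only
      }
      d <- d + 1
    }
    if (is_prime) primes <- c(primes, n)
  }
  primes
}

# Function returning two values at once as a named list
rectangle <- function(length, width) {
  list(area = length * width, perimeter = 2 * (length + width))
}

# Recursive factorial, reused to compute nCr = n! / (r! (n - r)!)
fact <- function(n) {
  if (n <= 1) 1 else n * fact(n - 1)
}
ncr <- function(n, r) {
  fact(n) / (fact(r) * fact(n - r))
}

print(primes_below(1000))
print(rectangle(4, 3))
print(ncr(5, 2))
```

R's built-in `choose(n, r)` is a convenient cross-check for `ncr`.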

 

Lab 5: Data Frame, List and Matrix (Date:     )

·  Create a data frame with the following data and perform operations on it (add, remove, summary, etc.). Write the data frame to a CSV file.

     Name    Language   Age
1    Amiya   R          22
2    Raj     Python     25
3    Asish   Java       45

 

·  Find the average of a set of numbers (list).

·  Write a program to create two matrices and perform various operations on them. (matrix)
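
The Lab 5 tasks might be approached as in the sketch below. The added row ("Meera"), the list of numbers, the matrix values and the output file name are all assumptions made for illustration; the file is written to a temporary directory so nothing is overwritten.

```r
# Data frame built from the table given in the exercise
students <- data.frame(
  Name     = c("Amiya", "Raj", "Asish"),
  Language = c("R", "Python", "Java"),
  Age      = c(22, 25, 45)
)

# Add a row (hypothetical student), then remove one by condition
students <- rbind(students,
                  data.frame(Name = "Meera", Language = "C", Age = 30))
students <- students[students$Name != "Raj", ]

# Add a derived column and print a summary
students$Over30 <- students$Age > 30
print(summary(students))

# Write the data frame to a CSV file
out_file <- file.path(tempdir(), "students.csv")
write.csv(students, out_file, row.names = FALSE)

# Average of numbers stored in a list: flatten with unlist() first
marks <- list(12, 15, 9, 20)
avg <- mean(unlist(marks))
print(avg)

# Two matrices (filled column by column) and some common operations
A <- matrix(1:4, nrow = 2)
B <- matrix(5:8, nrow = 2)
print(A + B)      # element-wise sum
print(A * B)      # element-wise product
print(A %*% B)    # matrix product
print(t(A))       # transpose
print(solve(A))   # inverse (A is non-singular here)
```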

 

Lab 6: Data Visualization with scatterplot, pie chart, line graph (Date:     )

·       The subject codes and marks of 2 students are stored in vectors. Draw a line graph.

·       The age and speed of 10 cars are stored in two different vectors. Draw a scatter plot.

·       Create a vector representing the percentage of each grade among the students in a class. Plot a pie chart.

·       The areas of the various continents of the world (in millions of square miles) are as follows: 11.7 for Africa; 10.4 for Asia; 1.9 for Europe; 9.4 for North America; 3.3 for Oceania; 6.9 for South America; 7.9 for the Soviet Union. Draw a bar chart representing the given data.
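
A rough sketch of the four plots using base R graphics follows. Only the continent areas come from the exercise; the subject codes, marks, car ages and speeds, and grade percentages are made-up values for illustration.

```r
# Line graph: marks of two students across subjects (hypothetical data)
subjects <- c("S101", "S102", "S103", "S104", "S105")
student1 <- c(72, 65, 80, 58, 90)
student2 <- c(60, 75, 70, 85, 68)
plot(student1, type = "o", col = "blue", xaxt = "n",
     ylim = range(student1, student2),
     xlab = "Subject code", ylab = "Marks", main = "Marks of two students")
axis(1, at = seq_along(subjects), labels = subjects)
lines(student2, type = "o", col = "red")
legend("topright", legend = c("Student 1", "Student 2"),
       col = c("blue", "red"), lty = 1, pch = 1)

# Scatter plot: age and speed of 10 cars (hypothetical data)
age   <- c(5, 7, 8, 7, 2, 2, 9, 4, 11, 12)
speed <- c(99, 86, 87, 88, 111, 103, 87, 94, 78, 77)
plot(age, speed, pch = 19, xlab = "Age (years)", ylab = "Speed",
     main = "Age vs speed of cars")

# Pie chart: percentage of each grade in a class (hypothetical data)
grades <- c(A = 20, B = 35, C = 30, D = 15)
pie(grades, labels = paste0(names(grades), " (", grades, "%)"),
    main = "Grade distribution")

# Bar chart: continent areas in millions of square miles (from the exercise)
areas <- c(Africa = 11.7, Asia = 10.4, Europe = 1.9, "North America" = 9.4,
           Oceania = 3.3, "South America" = 6.9, "Soviet Union" = 7.9)
barplot(areas, las = 2, col = "steelblue",
        ylab = "Area (million sq. miles)", main = "Area by continent")
```

Setting `las = 2` rotates the axis labels so the longer continent names stay readable.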

 

 


Lab 7: Histogram, mean, median and variance (Date:     )

·       Draw the histogram of the following data:

Height of student (cm)   135 - 140   140 - 145   145 - 150   150 - 155
No. of students          4           12          16          8

·       The table below contains the population and murder rate (in murders per 100,000 people per year) for several states. Compute the mean, median and variance of the population.

State          Population    Murder
Alabama        4,779,736     5.7
Alaska         710,231       5.6
Arizona        6,392,017     4.7
Arkansas       2,915,918     5.6
California     37,253,956    4.4
Colorado       5,029,196     2.8
Connecticut    3,574,097     2.4
Delaware       897,934       5.8

·       Use the R built-in dataset airquality, described in the R documentation as "Daily air quality measurements in New York, May to September 1973." Create a box plot.

·       Create a strip chart for the Ozone readings of the “airquality” dataset.
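
The Lab 7 tasks can be sketched as follows. Because the histogram data is grouped, the sketch assumes each student sits at the midpoint of their class interval and passes the class boundaries to `hist()`. Plotting Ozone by month in the box plot is a choice made here, since the exercise does not name a variable.

```r
# Histogram from grouped data: expand each class into repeated midpoints
breaks    <- seq(135, 155, by = 5)
counts    <- c(4, 12, 16, 8)
midpoints <- head(breaks, -1) + 2.5
heights   <- rep(midpoints, times = counts)
h <- hist(heights, breaks = breaks, col = "lightblue",
          xlab = "Height of student", ylab = "No. of students",
          main = "Heights of students")

# Mean, median and variance of the state populations
population <- c(Alabama = 4779736, Alaska = 710231, Arizona = 6392017,
                Arkansas = 2915918, California = 37253956,
                Colorado = 5029196, Connecticut = 3574097, Delaware = 897934)
pop_mean   <- mean(population)
pop_median <- median(population)
pop_var    <- var(population)     # sample variance (divides by n - 1)
print(c(mean = pop_mean, median = pop_median, variance = pop_var))

# Box plot of Ozone for each month (missing values are dropped)
boxplot(Ozone ~ Month, data = airquality,
        xlab = "Month", ylab = "Ozone (ppb)", main = "Ozone by month")

# Strip chart of the Ozone readings; jitter separates overlapping points
stripchart(airquality$Ozone, method = "jitter",
           xlab = "Ozone (ppb)", main = "Ozone readings")
```

The mean is far above the median here because California's population pulls it upward, which is a useful point to discuss alongside the variance.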

 

Lab 8: Writing Efficient R Code (Date:     )

·       Find the statistical summary of Temp in the “airquality” dataset in R.

·       Given two data sets as vectors, find the Spearman correlation coefficient.

·       Plot the normal density curve and the cumulative distribution curve for a given mean and standard deviation.

·       Consider the Cars93 dataset in the R package “MASS” (the variables used here are not part of mtcars). For our model we will consider the variables "AirBags" and "Type". The aim is to find out whether there is a significant association between the type of car sold and the type of air bags it has. If an association is observed, we can estimate which types of cars sell better with which types of air bags. (Use the chi-square test.)
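
A minimal sketch of the Lab 8 tasks is given below. The two vectors, the mean of 50 and the standard deviation of 10 are illustrative assumptions. The columns AirBags and Type belong to the Cars93 data in the MASS package, which normally ships with R; the chi-square step is skipped if MASS is not available.

```r
# Statistical summary of Temp
temp_summary <- summary(airquality$Temp)
print(temp_summary)

# Spearman rank correlation between two vectors (hypothetical data)
x <- c(1, 2, 3, 4, 5)
y <- c(2, 1, 4, 3, 5)
rho <- cor(x, y, method = "spearman")
print(rho)

# Normal density and cumulative distribution curves (assumed parameters)
mu    <- 50
sigma <- 10
grid  <- seq(mu - 4 * sigma, mu + 4 * sigma, length.out = 200)
plot(grid, dnorm(grid, mean = mu, sd = sigma), type = "l",
     xlab = "x", ylab = "Density", main = "Normal density")
plot(grid, pnorm(grid, mean = mu, sd = sigma), type = "l",
     xlab = "x", ylab = "Cumulative probability", main = "Normal CDF")

# Chi-square test of independence between air bag type and car type.
# R may warn that the approximation is rough when expected counts are small.
if (requireNamespace("MASS", quietly = TRUE)) {
  car_table <- table(MASS::Cars93$AirBags, MASS::Cars93$Type)
  print(car_table)
  print(chisq.test(car_table))
}
```

A p-value below the chosen significance level (commonly 0.05) would indicate that air bag type and car type are not independent.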

 

Lab 9:

·       A data set contains the weights of 10 rabbits. Use the Wilcoxon test to determine whether the median weight of the rabbits differs from 25 g, is greater than 25 g, or is less than 25 g.

     name   weight   (generate data randomly)
1    R_1    27.6
2    R_2    30.6
3    R_3    32.2
4    R_4    25.3
5    R_5    30.9
6    R_6    31.0
7    R_7    28.9
8    R_8    28.9
9    R_9    28.9
10   R_10   28.2

·       Below is the sample data representing the observations:

# Values of height: 151, 174, 138, 186, 128, 136, 179, 163, 152, 131

# Values of weight: 63, 81, 56, 91, 47, 57, 76, 72, 62, 48

Predict the weight of a person with a height of 170 cm using regression in R. Plot the data and the regression line.

·       Consider the data set "mtcars" available in the R environment. It compares different car models in terms of miles per gallon ("mpg"), cylinder displacement ("disp"), horsepower ("hp"), weight of the car ("wt") and some more parameters. The goal of the model is to establish the relationship between "mpg" as the response variable and "disp", "hp" and "wt" as predictor variables. (Use multiple regression.)
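
The three Lab 9 exercises might be sketched as below, using the rabbit weights and the height/weight observations listed above. Passing `exact = FALSE` is a choice made here because the rabbit data contains tied values, for which R cannot compute an exact p-value.

```r
# One-sample Wilcoxon signed-rank tests against a median of 25 g
rabbits <- c(27.6, 30.6, 32.2, 25.3, 30.9, 31.0, 28.9, 28.9, 28.9, 28.2)
two_sided <- wilcox.test(rabbits, mu = 25, exact = FALSE)
greater   <- wilcox.test(rabbits, mu = 25, alternative = "greater",
                         exact = FALSE)
less      <- wilcox.test(rabbits, mu = 25, alternative = "less",
                         exact = FALSE)
print(c(two_sided = two_sided$p.value, greater = greater$p.value,
        less = less$p.value))

# Simple linear regression: weight explained by height
height <- c(151, 174, 138, 186, 128, 136, 179, 163, 152, 131)
weight <- c(63, 81, 56, 91, 47, 57, 76, 72, 62, 48)
fit <- lm(weight ~ height)
predicted <- predict(fit, newdata = data.frame(height = 170))
print(predicted)

# Scatter plot of the data with the fitted regression line
plot(height, weight, pch = 16, col = "blue",
     xlab = "Height (cm)", ylab = "Weight",
     main = "Height and weight regression")
abline(fit, col = "red")

# Multiple regression: mpg explained by disp, hp and wt
multi_fit <- lm(mpg ~ disp + hp + wt, data = mtcars)
print(summary(multi_fit))
```

The coefficient table printed by `summary()` shows the estimated effect of each predictor together with its p-value.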

 

Lab 10: ARIMA (Date:     )

·       Predict the next 10 sale values of the BJsales dataset (built into R) using an ARIMA model from the R package “forecast”. Plot the graph showing the forecast. (Install and use the required package.)
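
As a hedged sketch of the forecasting step, the code below swaps in base R's `arima()` and `predict()` from the stats package so that it runs without installing anything. The order (1, 1, 1) is an assumption, not a value given in the exercise. With the forecast package installed, `auto.arima()` and `forecast()` play the corresponding roles and can choose the order automatically.

```r
# BJsales is a time series of 150 observations in R's datasets package
fit <- arima(BJsales, order = c(1, 1, 1))   # assumed ARIMA(1,1,1)
print(fit)

# Forecast the next 10 values, with standard errors
fc <- predict(fit, n.ahead = 10)
print(fc$pred)

# Plot the observed series and the forecast with an approximate 95% band
upper <- fc$pred + 1.96 * fc$se
lower <- fc$pred - 1.96 * fc$se
ts.plot(BJsales, fc$pred, upper, lower,
        col = c("black", "red", "grey50", "grey50"),
        lty = c(1, 1, 2, 2),
        main = "BJsales with 10-step ARIMA forecast")
```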

 

 

Practice Questions

 

·       The built-in data set "mtcars" describes different car models and their engine specifications. In "mtcars", the transmission mode (automatic or manual) is described by the column "am", which is a binary value (0 or 1). Create a logistic regression model between "am" and 3 other columns: hp, wt and cyl. Report which predictors are significant.

·       Consider the monthly rainfall details at a place starting from January 2012. Create an R time series object for a period of 12 months and plot it.

# Rainfall: 799, 1174.8, 865.1, 1334.6, 635.4, 918.5, 685.5, 998.6, 784.2, 985, 882.8, 1071
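
The two practice questions could be approached roughly as follows. The sketch assumes the 12 rainfall values are monthly readings beginning in January 2012, which is why `frequency = 12` is used.

```r
# Logistic regression: transmission type (am) explained by hp, wt and cyl
logit_fit <- glm(am ~ hp + wt + cyl, data = mtcars, family = binomial)
coef_table <- summary(logit_fit)$coefficients
print(coef_table)   # the last column holds the p-value of each term

# Time series object for 12 months starting January 2012
rainfall <- c(799, 1174.8, 865.1, 1334.6, 635.4, 918.5,
              685.5, 998.6, 784.2, 985, 882.8, 1071)
rain_ts <- ts(rainfall, start = c(2012, 1), frequency = 12)
print(rain_ts)
plot(rain_ts, xlab = "Time", ylab = "Rainfall",
     main = "Rainfall from January 2012")
```

Predictors whose p-value in the coefficient table falls below the chosen significance level are the ones to report as significant.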

 
