## Exploratory Data Analysis and Data Summarization

The purpose of this chapter is to introduce you to working with and manipulating data in R, exploring data, and plotting. I strongly believe in learning by doing, so let’s start doing some things so you can start learning!

In this book, we will use data from an experiment I conducted when I was a postdoc at the Smithsonian Tropical Research Institute in Panama in 2010, and which was published in the journal Ecology in 2013 (https://www.jstor.org/stable/23436298). The experiment was part of a National Science Foundation (NSF) funded project to Drs. Karen Warkentin (Boston University) and James Vonesh (Virginia Commonwealth University) studying the effects of flexible hatching timing by red-eyed treefrog (Agalychnis callidryas) embryos on interactions with predators and food levels and subsequent phenotype development of tadpoles. In order to follow along with the examples in this chapter, and the rest of the book, you should download from the GitHub page for this book (https://github.com/jtouchon/Applied-Statistics-with-R) a .csv file titled “RxP.csv.” The data are called by the short name “RxP,” which stands for “Resource-by-Predation,” which was the nature of the experiment (we were studying the interaction of resources and predators). This brings up a chance to reiterate a small but important point: since R is entirely based on typing commands by hand, you should give your datasets and variables short names so that they are quick and easy to type.

First, let’s get a handle on what the data are.

## READING IN THE DATA FILE

If you are loading a data file from somewhere on your computer, you will read it into the active workspace with the command `read.csv()`. If your dataset has categorical variables, you will want to include the argument `stringsAsFactors = TRUE`, which tells the function to automatically convert any columns containing character data into factors. If the data are in your working directory, you simply pass the file name, in quotes, to `read.csv()` (and remember, you should be working in a script window!). Note that you have to assign the data to an object. What happens if you do not?

What is the working directory, you ask? It is the directory that R will look in by default. To find out where R is looking, type `getwd()` at the prompt.
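As a rough sketch of the points above, the following reads a small CSV file with and without `stringsAsFactors = TRUE`. The toy data frame, its column names (`tank`, `pred`, `mass`), and the temporary file path are all invented for illustration; they are not the contents of RxP.csv. A full path from `tempfile()` is used only so the sketch can run anywhere.

```r
# Toy stand-in for a data file. These columns are invented for illustration
# and are not the real contents of RxP.csv.
toy <- data.frame(
  tank = 1:4,
  pred = c("control", "lethal", "control", "lethal"),
  mass = c(0.21, 0.35, 0.19, 0.40)
)
toy_path <- tempfile(fileext = ".csv")  # hypothetical file location
write.csv(toy, toy_path, row.names = FALSE)

# Where R looks by default when given a bare file name
getwd()

# Read the file and assign the result to an object so it can be reused
dat_fct <- read.csv(toy_path, stringsAsFactors = TRUE)
dat_chr <- read.csv(toy_path, stringsAsFactors = FALSE)

class(dat_fct$pred)  # "factor"
class(dat_chr$pred)  # "character"

# Without an assignment, the data frame is printed once and not stored
read.csv(toy_path)
```

If the file sat in the working directory instead, the bare file name in quotes would take the place of `toy_path`.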

You can set your working directory with the function `setwd()`, where you put a path to a folder on your computer inside the parentheses. Make sure to put the path in quotes. Some folks like to set a working directory for each project they have. Others prefer to keep the working directory in a single place and just code the path to certain files. Choose whichever works for you. If your data are not in your working directory, you will need to specify exactly where to find the file on your computer.
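The two approaches might look like the sketch below. The folder name `my_project` and the file `toy.csv` are hypothetical, and a temporary folder stands in for a real project folder so that nothing on your computer is changed permanently.

```r
# Remember the current working directory so it can be restored later
old_wd <- getwd()

# A temporary folder stands in for a project folder (hypothetical location)
project_dir <- file.path(tempdir(), "my_project")
dir.create(project_dir, showWarnings = FALSE)

# Small made-up file; not the real RxP.csv
write.csv(data.frame(x = 1:3, y = c(2.5, 3.1, 4.8)),
          file.path(project_dir, "toy.csv"), row.names = FALSE)

# Option 1: set the working directory, then refer to the file by its bare name.
# With a literal path you would type it in quotes, e.g. setwd("some/folder").
setwd(project_dir)
dat1 <- read.csv("toy.csv")

# Option 2: leave the working directory alone and give the full path
setwd(old_wd)
dat2 <- read.csv(file.path(project_dir, "toy.csv"))

# Both routes read the same file
identical(dat1, dat2)
```

Building paths with `file.path()` rather than pasting strings together by hand avoids having to think about which separator the operating system expects.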

## DATA EXPLORATION AND ERROR CHECKING

Whenever you start working with a dataset in R, you should first devote substantial time to checking it for errors. Questions you should ask yourself include:

• Did the data import correctly?
• Are the column names correct?
• Are the types of data appropriate? (e.g., factor vs numerical)
• Are the numbers of columns and rows appropriate?
• Are there typos?

If, for example, a column that is supposed to be numerical shows up as a factor, that likely indicates a typo where you accidentally have text in place of a number (remember, each column in a data frame is a vector, and vectors can only have one mode, so a vector with both numbers and characters is treated as if it is all characters). Similarly, if you have a factor that should have 3 categories, but imports with 4, you likely have a typo (e.g., “predator” vs “predtaor”), and the misspelled version is showing up as a separate category. These sorts of mistakes are very common!
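One way to run through these checks is sketched below on a made-up data frame that contains both kinds of typo on purpose. The column names, the values, and the "correct" replacement value are assumptions for this toy example and have nothing to do with the real RxP data.

```r
# Made-up data with two deliberate errors; not the real RxP data
raw <- data.frame(
  pred = c("control", "predator", "predtaor", "nonlethal", "predator"),
  mass = c("0.21", "0.35", "O.19", "0.40", "0.33"),  # "O.19" has a letter O
  stringsAsFactors = TRUE
)

str(raw)          # column types: mass shows up as a factor, not as numbers
dim(raw)          # number of rows and columns
names(raw)        # column names
levels(raw$pred)  # four levels where three were expected

# Find entries that cannot be converted to a number
mass_chr <- as.character(raw$mass)
bad <- is.na(suppressWarnings(as.numeric(mass_chr)))
mass_chr[bad]

# Fix the typos, then rebuild the columns with the right types
mass_chr[bad] <- "0.19"  # assumed correct value for this toy example
raw$mass <- as.numeric(mass_chr)
raw$pred <- as.character(raw$pred)
raw$pred[raw$pred == "predtaor"] <- "predator"
raw$pred <- factor(raw$pred)

levels(raw$pred)  # now three levels
```

Converting a factor through `as.character()` before `as.numeric()` matters here: calling `as.numeric()` directly on a factor returns the internal level codes rather than the numbers that were typed.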

Because this dataset has been thoroughly examined (very thoroughly!), these types of errors are not present. However, you might want to change the names of columns or remove outliers, which we will cover in the subsequent sections.
