统计代写|应用统计代写applied statistics代考|More Linear Models in R

statistics-lab™ 为您的留学生涯保驾护航 在代写应用统计applied statistics方面已经树立了自己的口碑, 保证靠谱, 高质且原创的统计Statistics代写服务。我们的专家在代写应用统计applied statistics代写方面经验极为丰富，各种代写应用统计applied statistics相关的作业也就用不着说。

• Statistical Inference 统计推断
• Statistical Computing 统计计算
• (Generalized) Linear Models 广义线性模型
• Statistical Machine Learning 统计机器学习
• Longitudinal Data Analysis 纵向数据分析
• Foundations of Data Science 数据科学基础

统计代写|应用统计代写applied statistics代考|More Linear Models

The purpose of this chapter is to expand on what you learned in Chapter 5 by introducing the many kinds of linear models you can fit in R. This chapter assumes you are using the RxP.byTank dataset that was created in Chapter 5.
This chapter will cover the following topics:

1. Analysis of Variance (ANOVA) with more than one predictor
2. Linear regression
3. Analysis of Covariance (ANCOVA)
4. Plotting regressions
5. Predicting values from a model

As a reminder, we are concerned only with “normal” data for the moment. By normal data, we mean data that fit a normal distribution, with values spread roughly symmetrically around the mean. Table 6.1 shows the variety of models that we can fit under the lm() framework. Note that all we do is change the number and type of predictor variables to fit different types of models.
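
To make the point about predictor types concrete, here is a minimal sketch. The data frame `d` and its columns are made up for illustration and are not part of the RxP data; only the right-hand side of the formula changes between models.

```r
set.seed(42)
# Hypothetical data: one continuous response, one categorical and one continuous predictor
d <- data.frame(
  group = factor(rep(c("A", "B", "C"), each = 10)),
  x     = runif(30, 0, 10)
)
d$y <- 2 + 0.5 * d$x + as.numeric(d$group) + rnorm(30, sd = 0.5)

m_anova  <- lm(y ~ group, data = d)      # one-way ANOVA: one categorical predictor
m_reg    <- lm(y ~ x, data = d)          # linear regression: one continuous predictor
m_ancova <- lm(y ~ x + group, data = d)  # ANCOVA: one continuous, one categorical

anova(m_ancova)
```

The same lm() call does all three jobs; R decides how to treat each predictor from its type (factor or numeric).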

统计代写|应用统计代写applied statistics代考|GETTING STARTED

The model commonly referred to as a linear model, fit in R with lm(), is one of the most flexible and useful in all of statistics. We will discuss generalized linear models (GLMs) in Chapter 7, which allow us to analyze non-normally distributed data. Before we get started, let’s load all the packages you will need here. This is something I like to do at the head of any script file. Remember that if you don’t have one of these packages already, you can download it by using the Package Installer menu, or simply by typing install.packages() at the prompt and putting the name of the package in quotes inside the parentheses. If you use the menu, make sure you click the button to “include dependencies.” This is important since most packages rely on other packages to run.
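
One possible way to handle this at the head of a script is sketched below. The package names are an assumption, taken from packages mentioned elsewhere in this text; substitute whatever list your script actually needs.

```r
# Assumed package list (illustrative only)
needed <- c("ggplot2", "dplyr", "cowplot")

# Check which packages are available without attaching them yet
installed <- vapply(needed, requireNamespace, logical(1), quietly = TRUE)
not_installed <- needed[!installed]

if (length(not_installed) > 0) {
  # Tell the user what to install rather than installing silently
  message("Install first with: install.packages(c(",
          paste0('"', not_installed, '"', collapse = ", "), "))")
}

# Attach everything that is available
for (pkg in needed[installed]) {
  library(pkg, character.only = TRUE)
}
```

install.packages() installs dependencies that a package needs by default, which is the command-line counterpart of the “include dependencies” button.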

This shows that we have very strong effects of both resource and predator treatments and a significant interaction between the two (albeit not a particularly strong one). In other words, predators alter the age at metamorphosis and so do the different resource levels. However, more importantly our model says that the effect of resources on age at metamorphosis depends on the presence of predators. We could also reframe this in the other direction; the effect of predators on age at metamorphosis depends on the amount of food tadpoles are fed.

统计代写|应用统计代写applied statistics代考|MULTI-WAY ANALYSIS OF VARIANCE-ANOVA

The linear model we ran in Chapter 5 (remember, it was called “lm1”) had a single categorical predictor. This type of model is called a one-way Analysis of Variance (ANOVA). But what about when we have multiple categorical predictors that may interact with one another? This is generally referred to as a multi-way ANOVA: a model with two predictors would be a two-way ANOVA, three predictors would be a three-way ANOVA, and so on. Okay, but what does it mean to “interact”?

For the RxP data, it is very reasonable to expect that the effects of predators on age at metamorphosis or mass at metamorphosis (or any other response variable) might differ under different resource conditions. To code an interaction between two effects in R you use a colon (:). However, R also provides us with a shortcut for writing both individual effects (e.g., Res or Pred) and the interaction between them (e.g., Res:Pred). Using the asterisk (*) tells R to look at both individual and interaction effects between whatever predictors you have provided. Thus, “Res*Pred” is the same as writing “Res+Pred+Res:Pred.” This model is therefore looking at the effects of just resources, of just predators, and then of a possible interaction between the two. Here, we will switch from using Age.FromEmergence (which is what we examined in Chapter 5) to Age.DPO, which is more intuitive since it tells us the age at which froglets metamorphosed, counted from when they were laid as eggs.

Let’s begin with a two-way ANOVA examining the interacting effects of resources and predators on the log-transformed age at metamorphosis.
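
A minimal sketch of such a model follows. Because the real RxP.byTank object is not available here, the code simulates a stand-in data frame called `tanks`; the treatment levels, effect sizes, and the model name `lm2` are assumptions made for illustration.

```r
set.seed(1)
# Hypothetical stand-in for RxP.byTank: 2 resource levels x 3 predator levels, 8 tanks each
tanks <- expand.grid(Res = c("Hi", "Lo"), Pred = c("C", "NL", "L"), rep = 1:8)
tanks$Age.DPO <- exp(rnorm(nrow(tanks),
                           mean = 4 + 0.2 * (tanks$Res == "Lo") + 0.3 * (tanks$Pred == "L"),
                           sd = 0.1))

# Two-way ANOVA with interaction, written with the asterisk shortcut
lm2 <- lm(log(Age.DPO) ~ Res * Pred, data = tanks)

# The same model written out in full
lm2_long <- lm(log(Age.DPO) ~ Res + Pred + Res:Pred, data = tanks)

anova(lm2)  # rows for Res, Pred, Res:Pred, and Residuals
```

Both formulas expand to the same three terms, so the two fits are identical.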


有限元方法代写

tatistics-lab作为专业的留学生服务机构，多年来已为美国、英国、加拿大、澳洲等留学热门地的学生提供专业的学术服务，包括但不限于Essay代写，Assignment代写，Dissertation代写，Report代写，小组作业代写，Proposal代写，Paper代写，Presentation代写，计算机作业代写，论文修改和润色，网课代做，exam代考等等。写作范围涵盖高中，本科，研究生等海外留学全阶段，辐射金融，经济学，会计学，审计学，管理学等全球99%专业科目。写作团队既有专业英语母语作者，也有海外名校硕博留学生，每位写作老师都拥有过硬的语言能力，专业的学科背景和学术写作经验。我们承诺100%原创，100%专业，100%准时，100%满意。

MATLAB代写

MATLAB 是一种用于技术计算的高性能语言。它将计算、可视化和编程集成在一个易于使用的环境中，其中问题和解决方案以熟悉的数学符号表示。典型用途包括：数学和计算算法开发建模、仿真和原型制作数据分析、探索和可视化科学和工程图形应用程序开发，包括图形用户界面构建MATLAB 是一个交互式系统，其基本数据元素是一个不需要维度的数组。这使您可以解决许多技术计算问题，尤其是那些具有矩阵和向量公式的问题，而只需用 C 或 Fortran 等标量非交互式语言编写程序所需的时间的一小部分。MATLAB 名称代表矩阵实验室。MATLAB 最初的编写目的是提供对由 LINPACK 和 EISPACK 项目开发的矩阵软件的轻松访问，这两个项目共同代表了矩阵计算软件的最新技术。MATLAB 经过多年的发展，得到了许多用户的投入。在大学环境中，它是数学、工程和科学入门和高级课程的标准教学工具。在工业领域，MATLAB 是高效研究、开发和分析的首选工具。MATLAB 具有一系列称为工具箱的特定于应用程序的解决方案。对于大多数 MATLAB 用户来说非常重要，工具箱允许您学习应用专业技术。工具箱是 MATLAB 函数（M 文件）的综合集合，可扩展 MATLAB 环境以解决特定类别的问题。可用工具箱的领域包括信号处理、控制系统、神经网络、模糊逻辑、小波、仿真等。

统计代写|应用统计代写applied statistics代考|Basic Statistical Analyses using R


The purpose of this chapter is to introduce you to some basic statistical analyses using R. This chapter assumes you are using the RxP.clean dataset that was created in Chapter 3.
We will cover the following topics:

1. Assessing data normality
2. Some basic non-parametric statistics
3. Student’s t-test
4. One-way Analysis of Variance (ANOVA)

We have now seen how to use a function like qplot() to look at your data in various ways. For example, you can plot a histogram of your response variable and see how it is distributed. However, as you move beyond the initial steps of data exploration and start to think about data analysis, there are several questions you should ask yourself. The most important of these is: just what kind of data do you have?

This might seem like a simple question at first, but it is paramount for determining the analyses you will conduct. When we talk about data, are we talking about your response or your predictor variables? The answer is both, of course. Knowing the shape of your response and predictor variables will determine what sort of analysis you do. In addition to simply knowing whether the data are normal or not, you should be mindful of whether you have one predictor or multiple predictors, and whether your predictors are continuous (a bunch of numbers) or discrete (different categories). For this chapter and the next, we will just concern ourselves with analyzing data where the response variable is normally distributed.

统计代写|应用统计代写applied statistics代考|AVOIDING PSEUDOREPLICATION

Although we have data on nearly 2500 individual metamorphs, those data are not all independent from one another. This is because groups of tadpoles were raised in common environments (the mesocosms, aka tanks) and variation between tanks may make certain individuals more similar than others. If we treat all individuals as independent, we are committing pseudoreplication, which is when you artificially inflate your sample size of independent observations. This was first mentioned in Chapter 2, but for more details, see the classic article Hurlbert, S.H. 1984. “Pseudoreplication and the design of ecological field experiments.” Ecological Monographs 54(2): 187-211. It’s a little long, but it’s a great read!

One easy way to avoid pseudoreplication is to utilize the mean value for each tank instead of that of individuals. We can summarize our entire dataset with relative ease using the summarize() function that was introduced in Chapter 3. We will have to get rid of column 1 on Individual ID and column 3 which shows the tank within each block because those would not be meaningful to average at the tank level. Recall that in order to use summarize() we first: 1) define our dataset, then 2) define the different variables we want to group our data by with the group_by() function, and lastly 3) use summarize() to create the variables that are the mean values from the raw dataset. At each of the three steps you pipe your data from one line to the next. You could also use the aggregate() function by defining the 7 columns to summarize (our response variables) and binding them together in a single object using the function cbind(), which binds two or more columns together, then give it the various predictor variables we want to use to summarize the data. Here, we will use summarize(), which just makes more sense. Recall that these functions are found in the dplyr package.

One other thing to notice in the following code is that there are more variables than necessary in the group_by() function. All we need to uniquely identify each tank is the variable “Tank.Unique” or the three treatments in combination. By including all of them, as well as the “Block” variable, we merely carry more columns through to our new summarized dataset. Lastly, notice that we are storing the output of this set of code as a new object called “RxP.byTank.”
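
The aggregate() alternative described above can be sketched in base R as follows. This is not the chapter's own code: the data frame `ind` is a simulated stand-in with only two response columns (the real dataset has seven), and the values are invented.

```r
set.seed(7)
# Hypothetical stand-in for the individual-level data (not the real RxP.clean)
ind <- data.frame(
  Tank.Unique = rep(1:6, each = 5),
  Pred        = rep(c("C", "NL", "L"), each = 10),
  Mass.final  = rnorm(30, mean = 0.4, sd = 0.05),
  SVL.final   = rnorm(30, mean = 18, sd = 1)
)

# cbind() binds the response columns together; the right-hand side of the
# formula lists the variables that identify each tank
byTank <- aggregate(cbind(Mass.final, SVL.final) ~ Tank.Unique + Pred,
                    data = ind, FUN = mean)
byTank
```

Each row of `byTank` is now one tank, so tanks rather than individuals are the unit of replication. Carrying `Pred` along does not change the grouping, since each tank belongs to only one predator treatment.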

统计代写|应用统计代写applied statistics代考|Looking at the data

The first and easiest thing to do if you want to see if your data are normal or not is to plot the histogram of values. You could use either the hist() function or the qplot() function found in the ggplot2 package. Either way, a histogram should give you a general sense of what your data look like. Remember that data which are normally distributed will have a relatively even spread of values above and below the mean, like that shown in Figure 5.1. Let’s look at the variables “Age.FromEmergence” and “SVL.final” and plot them using qplot() from the ggplot2 package. Remember, you have to load the package first using the library() function. Notice in the following code that we can use the “bins=” argument to specify how finely we want the histogram to break up the data. Also notice I’ve loaded the package cowplot, which contains the function plot_grid(), which allows us to plot multiple different figures together in a single window.

Looking at these two histograms (Figure 5.2) is quite informative. The SVL data appear almost normal, although the data have a slight tail to the right. The Age data are, however, very skewed; most individuals emerged very early in the period of metamorphosis, but the tail is very long, with individuals still emerging from tanks more than 100 days after the first metamorphs. Many biological patterns can be made normal with log-transformation, so a good first step is to see what effect that has on our data. Data that can be made normal upon log-transformation are referred to as “lognormal.” As we explored in Chapter 3, plotting data as a density plot can also be useful, and so both will now be shown in Figure 5.3.
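
The effect of log-transformation on skewed data can be sketched with base graphics. The vector `age` below is simulated from a lognormal distribution purely for illustration; it is not the real Age.FromEmergence variable, and hist() is used here instead of qplot().

```r
set.seed(3)
# Hypothetical right-skewed ages drawn from a lognormal distribution
age <- rlnorm(500, meanlog = 3, sdlog = 0.6)

par(mfrow = c(1, 2))                                    # two panels side by side
hist(age, breaks = 30, main = "Raw")                    # long right tail
hist(log(age), breaks = 30, main = "Log-transformed")   # roughly symmetric
par(mfrow = c(1, 1))

# In skewed data the mean is pulled away from the median; after logging, much less so
c(raw = mean(age) - median(age),
  logged = mean(log(age)) - median(log(age)))
```

The `breaks` argument of hist() plays a similar role to “bins=” in qplot(), although hist() treats it as a suggestion rather than an exact count.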


统计代写|应用统计代写applied statistics代考|Histograms and density plots


By default, if you plot just a single continuous variable using qplot(), R will plot a histogram (Figure 4.5, left). Histograms are extremely useful for seeing the distribution of your data. Do they look normally distributed? Are they skewed to one side or the other? There is technically no need to specify a “geom” here, but I think it is a good idea to be clear in your code, so I would recommend it.
# Make a basic histogram
a <- qplot(data = RxP.clean,
           x = Mass.final,
           geom = "histogram")
The same principle of using the color or fill arguments as a way to view your data applies to histograms, but with one caveat. If you add a fill or color argument to a histogram in qplot(), R will make a stacked histogram (Figure 4.5, middle). It can be more useful to see the data distributions overlaid on one another. This is best achieved with a density plot, which is similar to a histogram but instead plots a smoothed line that shows the shape of the data (Figure 4.5, right). Note that you should use the “col” argument in the density plot instead of the “fill” argument. What happens if you do not?
# Make a stacked histogram
b <- qplot(data = RxP.clean,
           x = Mass.final,
           geom = "histogram",
           fill = Pred)

# Make overlaid density plots
c <- qplot(data = RxP.clean,
           x = Mass.final,
           geom = "density",
           col = Pred)

统计代写|应用统计代写applied statistics代考|Scatterplots

The same principle works for continuous response variables. Previously, we defined our x-axis as a categorical variable, but if we instead use a continuous variable R will plot a scatterplot. We can still use facets or colors to visualize the variation in our data, which is extremely useful. For example, in the following code I’ve colored the points based on the resource treatment, and faceted the data based on the predator treatment. Imagine the possibilities (Figure 4.6)!
# Make a series of scatter plots
qplot(data = RxP.clean,
      x = log(SVL.final),
      y = log(Mass.final),
      col = Res,
      facets = . ~ Pred)

Note that when we make a plot like this, many of the points end up on top of one another, making it difficult to see all the data. We can add an argument to set the alpha level, or the degree of translucency of the points, to alleviate this issue. This is also particularly useful with density plots. For example, we can remake the density plots from above, but this time we will fill them instead of coloring them and set the alpha to be 0.5 (Figure 4.7).
qplot(data = RxP.clean,
      x = Mass.final,
      geom = "density",
      fill = Pred,
      alpha = 0.5)
Note that setting the alpha level in qplot() makes the alpha level 50% transparent, no matter what value you enter. I will show you how to set it to whatever you want later.

In addition to visualizing our data by setting the fill or color to one of our variables, we can also change the shape of the points based on a variable in our data frame with the “shape=” argument (Figure 4.8).
# Make a series of scatter plots
qplot(data = RxP.clean,
      x = log(SVL.final),
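
The ideas in this section (color, shape, facets, and translucency) can be combined in one plot. The sketch below uses ggplot() rather than qplot(), because ggplot() lets alpha be set directly on the points; that substitution, the simulated data frame `d`, and the choice to map both color and shape to Res are assumptions for illustration only. The plot is built only if ggplot2 is installed.

```r
set.seed(11)
# Hypothetical stand-in for RxP.clean
d <- data.frame(
  Res       = factor(rep(c("Hi", "Lo"), times = 30)),
  Pred      = factor(rep(c("C", "NL", "L"), each = 20)),
  SVL.final = rnorm(60, mean = 18, sd = 1.5)
)
d$Mass.final <- exp(-9 + 2.8 * log(d$SVL.final) + rnorm(60, sd = 0.1))

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(d, aes(x = log(SVL.final), y = log(Mass.final),
                     colour = Res, shape = Res)) +
    geom_point(alpha = 0.5) +   # translucent points reduce overplotting
    facet_grid(. ~ Pred)        # one panel per predator treatment
  print(p)
}
```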

In Chapter 3, we saw how to use functions from the dplyr package to summarize our data and we produced a tibble called RP.means that contained the means and standard errors for SVL.initial for each combination of resource and predator treatments. Now, we will see how to take those summarized data and turn them into a nice-looking figure. In particular, we are going to make a bar graph. Why a bar graph, you ask? There are several reasons really. First, despite their ubiquity in publications, the R gurus do not like bar graphs (or barplots, as we will call them) and making one is kind of a pain in R. This is because bar graphs can hide a lot about your data (they just show the mean and whatever your error bar of choice is). But the fact that they are difficult to create makes them an excellent tool for teaching many of the ways you can, and probably should, customize your figures. That said, box plots are much more informative and are finally becoming more widely used in published science. The second reason is that despite their downsides many people still like bar graphs and want to make them, so it is useful to know how to make one.

The most basic function to make a bar graph is barplot(). There are many, many arguments that can be passed to barplot(), which can be viewed in the help file (remember how to get to the help: ?barplot). A slightly improved version is the function barplot2(), which makes plotting error bars much simpler. barplot2() is found in the gplots package. You can also make a barplot in ggplot2. We will go through both examples. I find it useful and instructive to demonstrate the older technique using base graphics first, as it demonstrates how you can modify every little thing in a figure in R. Many of the coding techniques are also useful in ggplot2, and we will use ggplot2 for most everything in this book. If you feel confident you will never, ever, ever use base graphics, feel free to skip ahead a few pages to the section on ggplot2. However, if you want to learn a little more about R and customizing figures, I encourage you to follow along with the next few pages of commands.
4.3.1 Making a barplot in base graphics
Let’s start by simply plotting the bars in their most basic form. Note that I am assuming you still have the RPmeans object we created in Chapter 3.
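
A minimal sketch of a base-graphics barplot with error bars is shown below. The summarized object from Chapter 3 is not available here, so the treatment labels, means, and standard errors are invented stand-ins, and arrows() is one common way (not necessarily the one used in this book) to draw the error bars.

```r
# Hypothetical means and standard errors, standing in for the summarized data
trt      <- c("C-Hi", "C-Lo", "NL-Hi", "NL-Lo")
mean_svl <- c(19.5, 18.2, 18.9, 17.6)
se_svl   <- c(0.4, 0.5, 0.3, 0.6)

# barplot() draws the bars and returns the x position of each bar's midpoint
mids <- barplot(mean_svl, names.arg = trt,
                ylim = c(0, max(mean_svl + se_svl) * 1.1),
                ylab = "Mean SVL")

# Draw error bars as double-headed "arrows" with flat (90 degree) heads
arrows(x0 = mids, y0 = mean_svl - se_svl,
       x1 = mids, y1 = mean_svl + se_svl,
       angle = 90, code = 3, length = 0.05)
```

Saving the return value of barplot() matters because the bars are not centered on 1, 2, 3, and so on; the midpoints are needed to line anything else up with them.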


统计代写|应用统计代写applied statistics代考|Hex colors


Technically, R stores color in a system called hexadecimal, which defines each color channel (red, green, or blue) with two characters. Each character runs from 0-9 and then from A-F, giving a total of 16 different values per character. Since each channel is defined by two characters, it has 16 × 16 = 256 possible values (the same as we see with RGB!). Although hexadecimal codes may not be intuitive, it is easy to look up online exactly what the code is for any color you want to use (just do an internet search for something like “RGB to Hex color”). The main advantage of the Hex system over RGB is that it is a very compact way to specify whatever color you want. In R, the hex color code goes in quotes and is preceded by a #. An additional two characters can be added to define a degree of translucency (aka, the alpha level). For example, the translucent red color defined previously would be as follows.
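
As a sketch, assuming the color meant is pure red at 70% opacity (the value used in the RGB discussion), the hex code and a way to check it would look like this. The object name `red_translucent` is made up for illustration.

```r
# FF = red at maximum, 00 = no green, 00 = no blue, B3 = alpha
red_translucent <- "#FF0000B3"

# Decode the string back into 0-255 values, including the alpha channel
col2rgb(red_translucent, alpha = TRUE)

# B3 in hexadecimal is 179 in decimal, i.e. about 70% of 255
strtoi("B3", base = 16L) / 255
```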

统计代写|应用统计代写applied statistics代考|DATA EXPLORATION USING ggplot2

In the last chapter, you learned a little bit about how to make fairly simple figures using base graphics, i.e., the graphics functions that are built into the version of R you downloaded from CRAN. One problem with base graphics is that the figures produced are relatively utilitarian and ugly (at least in many people’s view). There is a whole universe of functions and arguments you can use to make them look better, but in their basic form they are plain. The package ggplot2 makes it very easy to make nice-looking figures. However, the syntax for coding in ggplot2 is a little different from base graphics. There are two workhorse functions in ggplot2: the first is qplot() (which stands for “quick plot”) and the other is ggplot(). We will cover ggplot() at a later date. For now, let’s explore qplot().

统计代写|应用统计代写applied statistics代考|Boxplots

Earlier, you made a boxplot using base graphics. The syntax for this was conveniently the same that we will use to define statistical models (“response ~ predictor”). ggplot2 does things differently. Instead, you explicitly specify what variable you want on the x or y axes, and you specify the type of plot you want using the geom argument, which is short for geometric object. Don’t forget that in order to use the qplot() function you first need to load the ggplot2 library. Also remember that you have to do this each time you restart R. In the following code I am using the number sign (#) to annotate the code. When you come back to your analyses in a week, a month, or a year, you need to have notes to remind yourself of what you were doing. Always leave notes in your script file for your future self.

Even in the most basic form, the figure made with ggplot2 (Figure 4.1) is nicer looking (to many people), but this is not why ggplot2 is so useful. Where ggplot2 really shines is in its ability to add colors and to plot data across many different variables at once, as well as to easily take the same type of data and plot it in different ways within a single coding framework. For example, if you want to add colors to your figures, you can use either the “col” or “fill” arguments. In the case of a boxplot, “col” will change the color of the outline of the boxes whereas “fill” changes the color inside each one. The effects of “col” or “fill” will differ based on the particular type of plot you are making. One thing that is really cool about plotting with ggplot2 is that we can define the colors as one of the variables in our dataset. R will be able to look at our data frame and know how many categories we have, and therefore how many colors to plot, and will even add a legend for us. So handy!
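
The two approaches can be compared side by side in a short sketch. The data frame `d` is simulated for illustration, and the ggplot2 version uses ggplot() with geom_boxplot() rather than qplot(); both choices are assumptions, not the chapter's own code. The ggplot2 part runs only if the package is installed.

```r
set.seed(5)
# Hypothetical stand-in data: final mass under three predator treatments
d <- data.frame(
  Pred       = factor(rep(c("C", "NL", "L"), each = 20)),
  Mass.final = c(rnorm(20, 0.40, 0.05), rnorm(20, 0.45, 0.05), rnorm(20, 0.55, 0.05))
)

# Base graphics: the "response ~ predictor" formula interface
boxplot(Mass.final ~ Pred, data = d)

# ggplot2: name the x and y variables explicitly and choose a geom;
# mapping fill to a variable colors the boxes and adds a legend
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(d, aes(x = Pred, y = Mass.final, fill = Pred)) +
    geom_boxplot()
  print(p)
}
```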


统计代写|应用统计代写applied statistics代考|SUMMARIZING AND MANIPULATING DATA


There are many tools in R to help you summarize your data efficiently. I think there are two functions worth knowing about at this stage: aggregate() from base R and summarize() from the dplyr package.
aggregate() is a very useful function that takes a column of raw data and summarizes it across one or more groups based on some chosen function (e.g., calculate the mean or the standard deviation, etc.). One nice thing about using aggregate() is that you specify how you want your data summarized in the exact same formula format that you use for specifying plots or models (which we will cover in Chapter 5). The output of aggregate() is a data frame, which makes it easy to then use for plotting figures or other purposes. aggregate() takes three arguments: 1) the response and predictor variables, 2) the function you want to execute (with the “FUN=” argument), and 3) the data frame where the data can be found. For example, if you want to calculate the mean size of metamorphs at emergence across all combinations of Food and Resource treatments, you can type the following:
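
A minimal sketch of the three-argument pattern is given below. The data frame `d`, the grouping columns Pred and Res, and the response SVL.initial are stand-ins chosen for illustration; the original example's exact variables are not shown in this text.

```r
set.seed(9)
# Hypothetical stand-in for the raw data
d <- data.frame(
  Pred        = rep(c("C", "NL", "L"), each = 8),
  Res         = rep(c("Hi", "Lo"), times = 12),
  SVL.initial = rnorm(24, mean = 18, sd = 1)
)

# 1) formula: response ~ grouping variables, 2) data frame, 3) summary function
means <- aggregate(SVL.initial ~ Pred + Res, data = d, FUN = mean)
means
```

Swapping `FUN = mean` for `FUN = sd` gives the standard deviation for the same groups.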

The package dplyr has many useful functions for data wrangling which we will cover in this book. By using some of these in combination, you can easily summarize your data. The utility of this may not be completely evident now, but it will be later, I promise (particularly in Chapter 9). The downside of dplyr is that, much like ggplot2, it has its own lexicon that is fairly distinct from the rest of R, meaning that you have to learn a whole different set of commands. So be it. It’s pretty great once you learn the coding.

By using a few choice functions, such as group_by() and summarize(), we can easily say we want to calculate the means of whatever variable (or variables) we are interested in. Note of course that you can use different functions besides mean(), which is what we will start with here. The other important thing to note in the following code is the use of the pipe command, %>%, which designates the output of one line to be the input of the next.
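
The group_by() and summarize() pattern can be sketched as follows, assuming dplyr is installed (the code is skipped otherwise). The data frame `d` is simulated, and the output names `RP.summary`, `mean.SVL`, and `se.SVL` are hypothetical.

```r
set.seed(9)
# Hypothetical stand-in for the raw data
d <- data.frame(
  Pred        = rep(c("C", "NL", "L"), each = 8),
  Res         = rep(c("Hi", "Lo"), times = 12),
  SVL.initial = rnorm(24, mean = 18, sd = 1)
)

if (requireNamespace("dplyr", quietly = TRUE)) {
  library(dplyr)
  RP.summary <- d %>%                       # 1) start with the dataset
    group_by(Res, Pred) %>%                 # 2) define the grouping variables
    summarize(mean.SVL = mean(SVL.initial), # 3) compute summaries per group
              se.SVL   = sd(SVL.initial) / sqrt(n()))
  print(RP.summary)
}
```

Each `%>%` hands the result of the line before it to the function on the next line, so the steps read top to bottom in the order they happen.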

统计代写|应用统计代写applied statistics代考|Introduction to Plotting

Before we get into the meat of how to most efficiently plot your data, it is useful to take a moment to talk about what makes a good figure. What is it that makes a high-quality figure, one that is suitable for publication or use in a presentation? I would argue that judicious use of color, large clear text and labels, and efficient usage of plot space are three hallmarks of a good figure. It is often useful to create multiple panels to show different aspects of your data. These fundamentals are the same whether or not you are making your figures in R, and whether you are using base graphics or ggplot2. If you are interested in a deeper dive into this topic, please check out the excellent book Fundamentals of Data Visualization by Claus O. Wilke, professor at The University of Texas (UT) at Austin. He has written a particularly elegant and useful package called cowplot that not only provides a nice and easy-to-apply theme for ggplot2 to make your figures look better, but also contains many other useful functions.

There are two main ways to make figures in R: base graphics (i.e., those that are built into the base version of R you downloaded from the Comprehensive R Archive Network (CRAN; https://cran.r-project.org/)) and the package ggplot2. There are functions in other packages (e.g., the scatterplot() function in the car package, or the barplot2() function in the gplots package), but these all utilize the basic coding of base graphics. While ggplot2 and base graphics use different coding styles, the fundamentals that make an effective graphic remain the same. Both types of coding allow you to build your graphics piece by piece and really give you control over every aspect of the figure.
Let’s talk about some basic nuts and bolts that are useful to know.

统计代写|应用统计代写applied statistics代考|Named colors

R has 657 named colors stored and ready to use. The full list of these can be viewed by typing colors() (or if you prefer, colours()) at the command prompt. Using named colors is nice because you probably have an intuitive sense of what “slate grey” is going to look like, whereas a color defined by its red, green, and blue (RGB) values is less obvious. If you need to know what the exact RGB values of a named color are, simply use the function col2rgb() and place the name of the color in quotes in the parentheses. For example, “col2rgb('yellow')” would tell you the RGB values for yellow in R.
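
These functions can be tried directly at the prompt. The short sketch below uses only base R; the search for “slate” is just one illustrative way to browse the list.

```r
length(colors())                        # how many named colors R knows about
head(colors())                          # the first few names
grep("slate", colors(), value = TRUE)   # every named color containing "slate"

col2rgb("yellow")                       # red, green, and blue on a 0-255 scale
```

Note that the names are stored without spaces, so “slate grey” is written "slategrey" (or "slategray") in code.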

R allows you to define colors based on their red, green, and blue (RGB) values. This is particularly useful if you have a color scheme you want to match, perhaps for a presentation in MS PowerPoint or Apple Keynote, and thus you can specify the exact RGB color you are looking for (which might have come from that other program). Colors are defined using the rgb() function, which requires 3 arguments (not surprisingly they are the red, green, and blue values) as well as other optional arguments. The default is that all values have a minimum of 0 and a maximum of 1, but you can set the max value to be 255 if you wish (which is likely how colors are defined in another program of your choosing). Lastly, you can also make an RGB color translucent by adjusting the alpha level (note that alpha is always on the same scale as the RGB values). For example, to set a color that was pure red but was 30% translucent (or conversely, 70% opaque), you would type the following.
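
A minimal sketch of that call, on both the 0-1 and the 0-255 scales, is shown below. The object names `red70` and `red70_255` are made up for illustration, and the plot() line is only there to show the color in use.

```r
# Pure red, 70% opaque, on the default 0-1 scale
red70 <- rgb(1, 0, 0, alpha = 0.7)

# The same idea on a 0-255 scale; alpha must be on that scale too
red70_255 <- rgb(255, 0, 0, alpha = 0.7 * 255, maxColorValue = 255)

red70  # rgb() returns the color as a hex string

# Overlapping translucent points show darker where they stack up
plot(c(1, 1.2, 1.4), c(1, 1, 1), pch = 16, cex = 8, col = red70,
     xlim = c(0.5, 2), xlab = "", ylab = "")
```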

统计代写|应用统计代写applied statistics代考|SUMMARIZING AND MANIPULATING DATA

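aggregate() is a very useful function that takes a column of raw data and summarizes it across one or more groups according to a function of your choosing (e.g., calculating the mean or the standard deviation). A nice thing about aggregate() is that you specify how you want the data summarized using the exact same formula format you use to specify a plot or a model (which we will cover in Chapter 5). The output of aggregate() is a data frame, which makes it easy to use for plotting or other purposes. aggregate() takes three arguments: 1) the response and predictor variables, 2) the function you want to perform (using the “FUN=” argument), and 3) the data frame where the data can be found. For example, you might want to calculate the mean size at metamorphosis for every combination of the food and resource treatments.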
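The three arguments can be sketched as follows. The data frame here is a toy stand-in: its column names (Pred, Res, SVL) and values are assumptions made for illustration, not the actual RxP data.

```r
# Toy data: two predator treatments crossed with two resource levels
toy <- data.frame(
  Pred = rep(c("Control", "Predator"), each = 4),
  Res  = rep(c("High", "Low"), times = 4),
  SVL  = c(20.1, 18.3, 21.0, 17.9, 18.2, 16.5, 18.8, 16.1)
)

# 1) formula: response ~ predictors, 2) the data frame, 3) the function to apply
svl_means <- aggregate(SVL ~ Pred + Res, data = toy, FUN = mean)
svl_means

# Swapping the function is all it takes to get a different summary
svl_sds <- aggregate(SVL ~ Pred + Res, data = toy, FUN = sd)
```

Because the result is itself a data frame, with one row per treatment combination, it can be passed straight to a plotting function.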

统计代写|应用统计代写applied statistics代考|Exploratory Data Analysis and Data Summarization

The purpose of this chapter is to introduce you to working with and manipulating data in R, exploring data, and plotting. I strongly believe in learning by doing, so let’s start doing some things so you can start learning!

In this book, we will use data from an experiment I conducted when I was a postdoc at the Smithsonian Tropical Research Institute in Panama in 2010, and which was published in the journal Ecology in 2013 (https://www.jstor.org/stable/23436298). The experiment was part of a National Science Foundation (NSF) funded project awarded to Drs. Karen Warkentin (Boston University) and James Vonesh (Virginia Commonwealth University) studying the effects of flexible hatching timing by red-eyed treefrog (Agalychnis callidryas) embryos on interactions with predators and food levels and subsequent phenotype development of tadpoles. In order to follow along with the examples in this chapter, and the rest of the book, you should download a .csv file titled “RxP.csv” from the GitHub page for this book (https://github.com/jtouchon/Applied-Statistics-with-R). The data are called by the short name “RxP,” which stands for “Resource-by-Predation,” which was the nature of the experiment (we were studying the interaction of resources and predators). This brings up a chance to reiterate a small but important point: since R is entirely based on typing commands by hand, you should give your datasets and variables short names so that they are quick and easy to type.
First, let’s get a handle on what the data are.

统计代写|应用统计代写applied statistics代考|READING IN THE DATA FILE

If you are loading a data file from somewhere on your computer, you will read it into the active workspace with the command read.csv(). If your dataset has categorical variables, you will want to include the argument stringsAsFactors=TRUE, which tells the function to automatically turn any columns that contain character data into factors. If the data are in your working directory, you simply pass the file name in quotes, for example RxP <- read.csv("RxP.csv", stringsAsFactors=TRUE) (and remember, you should be working in a script window!). Note that you have to assign the data to an object. What happens if you do not? What is the working directory, you ask? It is the directory that R will look in by default. To find out where R is looking, type getwd() at the prompt.
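So that the following sketch runs anywhere, it first writes a tiny made-up CSV file to a temporary location and then reads it back in. The file contents and column names are invented for illustration and are not the RxP data.

```r
# Write a small example file to a temporary location
csv_path <- tempfile(fileext = ".csv")
writeLines(c("Tank,Pred,Mass",
             "1,Control,0.41",
             "2,Predator,0.37",
             "3,Predator,0.35"), csv_path)

# Read it in and assign it to an object; character columns become factors
toy <- read.csv(csv_path, stringsAsFactors = TRUE)
str(toy)

# Without the argument, the same column is read as plain character data
toy_chr <- read.csv(csv_path, stringsAsFactors = FALSE)
class(toy$Pred)
class(toy_chr$Pred)

# Without assignment, the data are simply printed and then lost
read.csv(csv_path)
```

The last line answers the question posed above: if you do not assign the result, R prints the data to the console and keeps nothing.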

You can set your working directory with the function setwd(), where you put a path to a folder on your computer inside the parentheses. Make sure to put the path in quotes. Some folks like to set a working directory for each project they have. Others prefer to keep the working directory in a single place and just code the path to certain files. Choose whichever works for you. If your data are not in your working directory, you will need to specify exactly where to find the file on your computer.
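The two approaches can be sketched like this. The temporary folder stands in for a project folder, and the file location is hypothetical.

```r
# Remember where R is currently looking
original_wd <- getwd()

# Approach 1: point the working directory at a project folder
project_dir <- tempdir()      # stand-in for a real project folder
setwd(project_dir)
getwd()

# Put things back the way they were
setwd(original_wd)

# Approach 2: leave the working directory alone and build the full path
data_file <- file.path(project_dir, "RxP.csv")   # hypothetical location
data_file
file.exists(data_file)        # always worth checking before reading
```

file.path() assembles the path with the right separators for your operating system, which saves some typing mistakes.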

统计代写|应用统计代写applied statistics代考|DATA EXPLORATION AND ERROR CHECKING

Whenever you start working with a dataset in $\mathrm{R}$, you should first devote substantial time to checking it for errors. Questions you should ask yourself include:

• Did the data import correctly?
• Are the column names correct?
• Are the types of data appropriate? (e.g., factor vs numerical)
• Are the numbers of columns and rows appropriate?
• Are there typos?
If, for example, a column that is supposed to be numerical shows up as a factor, that likely indicates a typo where you accidentally have text in place of a number (remember, each column in a data frame is a vector, and vectors can only have one mode, so a vector with both numbers and characters is treated as if it is all characters). Similarly, if you have a factor that should have 3 categories, but imports with 4 , you likely have a typo (e.g., “predator” vs “predtaor”), and the misspelled version is showing up as a separate category. These sorts of mistakes are very common!
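Both kinds of mistakes can be sketched with a deliberately broken toy dataset. The column names and values are invented for illustration.

```r
# A toy dataset containing a misspelled category and text in a numeric column
toy <- data.frame(
  Pred = factor(c("control", "predator", "predtaor", "nonlethal")),
  Mass = c("0.41", "0.37", "none", "0.35"),
  stringsAsFactors = FALSE
)

# str() reveals both problems: 4 factor levels and a character Mass column
str(toy)
levels(toy$Pred)

# Fix the typo, then drop the now-empty misspelled level
toy$Pred[toy$Pred == "predtaor"] <- "predator"
toy$Pred <- droplevels(toy$Pred)
levels(toy$Pred)

# Convert Mass to numbers; the text entry becomes NA
toy$Mass <- suppressWarnings(as.numeric(toy$Mass))
toy$Mass
```

In practice it is usually better to correct the typo in the original data file as well, so the fix is not hidden in your script.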

Because this dataset has been thoroughly examined (very thoroughly!), these types of errors are not present. However, you might want to change the names of columns or remove outliers, which we will cover in the subsequent sections.

统计代写|应用统计代写applied statistics代考|BEST PRACTICES FOR DATA ANALYSIS

Once you have started analyzing your data, how should you actually go about the process? By that I don’t mean “sit at your computer and type code into $R$,” but what is the best way to think about running your analyses and coming to conclusions about your data?

The most important rule of the road is to have clear hypotheses at the outset of analyzing your data. You ran the experiment (presumably) and so you should have an idea of what questions you are trying to answer. It is important that you do not just try to measure everything under the sun and then look for possible relationships between your variables. If you measure enough things, you are more likely to find a statistically significant relationship that may or may not be meaningful. This is a process known as p-hacking or data dredging. Looking for patterns in your data is certainly an okay thing to do, but you should avoid testing every possible predictor you can imagine in the hope of finding something significant that will make your data more publishable. This is particularly true when you have many possible predictor variables.
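A small simulation can make the danger of data dredging concrete. This is a sketch with made-up settings (30 observations, 100 candidate predictors), in which the response and every predictor are pure random noise, so any "significant" relationship is spurious by construction.

```r
set.seed(42)
n_obs        <- 30    # number of observations (arbitrary choice)
n_predictors <- 100   # number of unrelated predictors to test (arbitrary choice)

# A response variable that is just random noise
response <- rnorm(n_obs)

# Test each random predictor against the response and keep its p-value
p_values <- sapply(seq_len(n_predictors), function(i) {
  predictor <- rnorm(n_obs)
  fit <- lm(response ~ predictor)
  summary(fit)$coefficients["predictor", "Pr(>|t|)"]
})

# How many of the meaningless predictors fall under the 0.05 cutoff?
sum(p_values < 0.05)
```

With a 0.05 cutoff you would expect roughly 5 of every 100 unrelated predictors to look "significant" by chance alone.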

A second dubious practice, which is to be avoided at all costs, is HARKing, which stands for “hypothesizing after the results are known.” HARKing basically goes like this: 1) you start with a hypothesis that you want to test; 2) you test it and find a non-significant effect; 3) while doing your series of analyses you come across some other variable which does have a significant explanatory effect; and 4) you omit your original hypothesis and concoct a new hypothesis which fits the results of the study. Essentially, you are coming up with a prediction after you know the answer. It is certainly important to realize that you might learn something new when you analyze your data. Perhaps the unforeseen result causes you to see a new explanatory relationship which you had not considered before. That’s fine, but it is important to be transparent about your original hypothesis versus the new hypothesis. Moreover, you should not be testing all sorts of predictors that may or may not seem important and then trying to come up with explanatory hypotheses after the fact.

The last piece of advice regarding best practices is don’t obsess over statistical significance. It has been said many times, but it bears repeating, that the 0.05 cutoff that is traditionally used to demarcate “significance” is a completely arbitrary line in the sand. Why isn’t it 0.01 or 0.10? Obviously, the idea of statistical significance is still an oft-used metric in many fields, and we will certainly talk about significance in this book, but don’t obsess over it. It isn’t the be-all and end-all of your research endeavors, because lots of things affect your ability to detect “significance” beyond the validity of your hypotheses, the magnitude of biological effects you measure, or the strength of your data. The more data you have, the greater your ability to detect very small effects. Similarly, with very little data, you will have a hard time detecting even moderate effects. The more variable your data are from individual to individual, the harder a time you will have detecting significance. So, focus your energy on understanding and estimating the explanatory power of your predictor variables. What you want to know is if there is a relationship between your predictor and response variables and how confident you can be in any relationship you’ve tried to quantify, so try to keep that in your mind at all times while analyzing your data.
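One way to see how sample size and variability drive "significance" is a quick power calculation with power.t.test() from base R. The effect size, standard deviations, and sample sizes below are arbitrary values chosen for illustration.

```r
# Same true difference between two groups (delta = 0.5) in every scenario

# Few individuals per group
small_n <- power.t.test(n = 10, delta = 0.5, sd = 1, sig.level = 0.05)

# Many individuals per group
large_n <- power.t.test(n = 200, delta = 0.5, sd = 1, sig.level = 0.05)

# Many individuals, but much more variation among them
noisy <- power.t.test(n = 200, delta = 0.5, sd = 3, sig.level = 0.05)

# Probability of detecting the effect at the 0.05 cutoff in each scenario
c(small = small_n$power, large = large_n$power, noisy = noisy$power)
```

The biological effect is identical in all three cases; only your ability to detect it changes.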

统计代写|应用统计代写applied statistics代考|HOW TO DECIDE BETWEEN COMPETING ANALYSES

Something we will explore later in this book with practical, real-world examples is how to choose amongst different possible ways to analyze your data. However, it is still useful to think about this issue up front, because it is an important one. Many times, you will be faced with different possible ways to analyze your data and different methods might have certain pros and cons.

At the end of the day, your job is to pick the statistical analysis that does the best job of representing and interpreting your data. Getting you to the point where you know how to evaluate your data and make that decision is one of the major goals of this book. Sometimes there will be two different types of models that may be essentially equivalent. They may fit equally well, and they may give you very similar answers about the significance of your predictors. Sometimes you might have to choose amongst competing models that give you different answers about the supposed significance (or not) of your predictors. How do you choose which model to go with?

You should be completely neutral about the idea of “significance” and only consider if the model is doing a good job and is appropriate for the data. As we will see in Chapter 7, there are times when you might have a model which might seem appropriate and gives you a highly significant result, but which actually fits very poorly. The opposite can be true as well, where you choose a model that fits poorly and gives you an underwhelming sense of the importance of your predictors. At the end of the day, you need to be able to justify your choice of model to yourself, your reviewers, your supervisors, your colleagues, and anyone else that might look at your analysis. You need to be able to stand behind your decision and explain why you chose one analysis over another. If you can do that, you’ll be in good shape.

统计代写|应用统计代写applied statistics代考|DATA ARE DATA ARE DATA

The data that we will work with in this book come from a study I conducted as a postdoc working in the rainforests of Panama at the Smithsonian Tropical Research Institute. I have studied frogs and their eggs and tadpoles since I began my PhD in 2002. I love frogs and their eggs and tadpoles. I think they are the greatest. But maybe you don’t, and maybe you are thinking to yourself: “How do these weird tadpole data relate to me and my analyses? I’m never going to study tadpoles in Panama!” While that might be true, I think that data are data are data, and the spatially nested design of the study we will work through is extremely similar to many other study systems (Figure 2.2).

The shorthand name of the dataset we will work with is “RxP” which stands for “Resource X Predation,” which are the two main things we were manipulating in the study (Chapter 3 will describe the study in more detail). In the RxP dataset, we have individuals that are nested (or grouped, if you will) within tanks, and those tanks are themselves nested within blocks. This is the same as, for example, having multiple mice housed together in a single cage, and then having cages grouped together on different shelves. Or, you might imagine multiple fruit flies measured per vial, and vials are grouped together based on the date they were set up. Similarly, we can think about having multiple genetic strains (i.e., genotypes, families, etc.) that cluster our data into discrete groups. This sort of nested design can also be used to think about repeated measures data, where individuals are measured or sampled repeatedly over time.
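The nesting described above can be sketched as a toy data frame. The sizes (2 blocks, 3 tanks per block, 4 individuals per tank) and the column names are assumptions for illustration, not the layout of the real RxP experiment.

```r
# Every combination of individual, tank, and block: 4 x 3 x 2 = 24 rows
toy <- expand.grid(Individual = 1:4, Tank = 1:3, Block = 1:2)

# Tank 1 in block 1 is a different tank from tank 1 in block 2,
# so give each tank an ID that is unique across the whole experiment
toy$Tank.Unique <- interaction(toy$Block, toy$Tank, drop = TRUE)
nlevels(toy$Tank.Unique)

# Cross-tabulate to confirm that each tank sits inside exactly one block
tanks_by_block <- table(toy$Tank.Unique, toy$Block)
tanks_by_block
all(rowSums(tanks_by_block > 0) == 1)
```

Swap "tank" for cage or vial and "block" for shelf or setup date, and the same structure describes the mouse and fruit fly examples.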

Beyond the idea of physically clustered data, we can just think about the tanks as different replicates of an experiment, which is no different from any other experiment. The response variables (also called the “dependent variables”) that were measured during the RxP experiment (things like number of days until metamorphosis or size at metamorphosis) are no different than any other continuous variable that might be measured, be it the length of time a rat freezes after seeing a light flash or the number of pecks a pigeon makes before correctly opening a lever or the number of new leaves produced after adding fertilizer to a growing plant. The predictor variables (also called the “independent variables”) are generally categorical in their nature (i.e., they have several discrete categories), and could easily be replaced by drug treatment or maternal enrichment environment or fertilization regime. As I said earlier, data are data are data. See Figure 2.2 for a visual depiction of the similarities in data structure.

统计代写|应用统计代写applied statistics代考|Before You Begin

统计代写|应用统计代写applied statistics代考|aka Thoughts on Proper Data Analysis

Before we embark on the journey that is learning $R$ and how to use it to analyze your data and make fantastic figures, it is useful to stop and think a little bit about best practices for data analysis.

2.2 BASIC PRINCIPLES OF EXPERIMENTAL DESIGN

If there are three words to remember when thinking about experimental design, they are balance, randomization, and replication. In a nutshell, what you are trying to prevent with these three factors is your data being correlated in some way that is unhelpful to your analysis. You are also trying to ensure that your data are independent from one another and that you have enough data to actually determine if your treatments did anything or not (see Boxes 2.1, 2.2, and 2.3).

Obviously, it is not always possible to control these aspects of your data, particularly if you have observational data (as compared to a controlled experiment which you design and run yourself). But, even in the case of observational studies, these principles are important to keep in mind and consider.

统计代写|应用统计代写applied statistics代考|BLOCKED EXPERIMENTAL DESIGNS

Consider the four experimental setups shown in Figure 2.1. Imagine that we are now testing the effects of four fertilizers on plant growth (labelled A, B, C, and D), each with 12 individuals. The experiment is conducted in four separate “blocks.” What is a block? It could be many things. Maybe it is a physical way of setting up the experiment, for example, four shelves in an incubator that contain the experimental units or four rooms that contain the cages our individuals live in. Maybe, due to space or time limitations, only 12 individuals can be tested or measured at a time, and thus the experiment has to be run four separate times. Each of these can be considered a “block,” so you can hopefully imagine how this idea relates to your own research. Blocks are only important to consider if there is some systematic difference among them.

In the first example, the four treatments will be perfectly correlated with the four blocks. Thus, if we imagine a significant difference is detected in one treatment, there is no way to know if it is because of the experimental treatment or if there was something else going on in that block (or room, or time point, or whatever you want to imagine that block represents). Once again, this is an example of pseudoreplication because it seems like we have a large sample size but in reality, we have a sample size of N = 1 in each of our treatments. Despite growing 48 different plants, this design is unreplicated.
The second example is a fully randomized design, where the four treatments are allocated across the four blocks completely at random. The third example is a fully balanced design, where each of the four treatments is assigned to each block in the same manner. Each of these setups has its own advantages and disadvantages.

The fully randomized design is good, and in theory should lead to the highest degree of replication, with all experimental units being truly independent. In reality it can actually work against the principle of balance, since some treatments might end up overrepresented in some blocks and underrepresented in others (e.g., in Figure 2.1, there are six individuals from treatment B in Block 2, but only one in Block 3). In the extreme, leaving your entire setup to random chance could lead to a horribly unbalanced and biased design, but this would be very rare. In general, fully randomizing your experiment is a very good idea!
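The three layouts can be sketched with sample() and table(). This is only an illustration of the idea, using the fertilizer example's numbers (4 treatments, 12 plants each, 4 blocks of 12); the object names are arbitrary.

```r
set.seed(1)
treatments <- rep(c("A", "B", "C", "D"), each = 12)  # 48 plants in total
block      <- rep(1:4, each = 12)                    # 12 plants per block

# Design 1: treatment perfectly confounded with block (pseudoreplicated)
confounded <- treatments
table(confounded, block)

# Design 2: fully randomized; counts per block are left to chance
randomized <- sample(treatments)
table(randomized, block)

# Design 3: balanced; every block gets 3 plants of each treatment,
# with positions shuffled within the block
balanced <- unlist(lapply(1:4, function(b) {
  sample(rep(c("A", "B", "C", "D"), each = 3))
}))
table(balanced, block)
```

Comparing the three tables shows why the randomized layout can drift away from balance while the balanced layout cannot.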

统计代写|应用统计代写applied statistics代考|YOU CAN (AND SHOULD) PLAN YOUR ANALYSES BEFORE YOU HAVE THE DATA!

In addition to the aspects of experimental design described previously, the other most important thing to do is to have a clear idea of your predictor and response variables before you even start the experiment. Before you ever put a mouse in a testing box or a seed in a growth chamber, you should identify what it is you are going to measure. Hopefully, if you know your study system pretty well or perhaps have some preliminary data, you can estimate what the data are going to look like, which will allow you to think about and plan for what type of analyses you will do. Maybe that sounds like wishful thinking, but this whole book is about the importance of knowing what your data look like, so don’t worry, you’ll get there!

Sir Ronald Fisher, one of the founders of modern statistics, offered one of the best statements about this issue in 1938. Fisher’s point is that because your experimental design directly affects your data analysis, you should think about your analysis up front when planning the experiment.

Spreadsheets (MS Excel, Google Sheets, etc.) are very useful for entering data, but not necessarily for analyzing data. Their flexible nature enables the user to enter all sorts of different pages with a variety of notes and a way to store those data. You can do things like color code individual cells or columns. In my experience, people are often terrible at organizing their data in a way that makes it useful for analysis. Most of the time, folks view their spreadsheets as a simple place to dump information. Your Excel file is not your scrapbook! For example, look at Figure 1.3.

On one level, this might seem like an intuitive way to enter our data. We can clearly see that Susan measured the animals in Block 1 and Darren measured the animals in Block 2. We can see that each person measured two tanks per block, and they measured 4 animals in each tank. Great, right? But if you look closer, you can see that Susan called the four animals in each tank the same things (tadpoles 1-4), whereas Darren gave them unique IDs (tadpoles 1-8). Susan used lowercase letters to abbreviate snout-vent length (SVL) for tank 1 but capitalized it for tank 2. Darren forgot to include a space in between “Tadpole” and “6.” R will treat inconsistent entries like these as separate and independent, causing problems. Evidently Susan didn’t measure the tails in tank 2 at all and just left the cells blank. Darren had two tadpoles without measurable tails and wrote the word “none” in each cell. All of these sorts of things would make the data impossible to analyze.

A much better way to organize these data is shown in Figure 1.4. What you want to aim for is one observation per row, which places data into a relatively long format. In this version, you have a separate column for each type of measurement you have taken and a separate row for each individual that has been measured. Each individual is given a unique identifier. NAs are used in place of any missing data. It might seem weird to have things repeated on many lines, such as the name of the measurer (Susan vs Darren) but you want each row to have all the information necessary to identify it.
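As a sketch of what "one observation per row" looks like once it is in R, here is a small made-up data frame. The column names and measurements are invented for illustration and are not the values shown in Figure 1.4.

```r
# One row per tadpole, one column per type of measurement
tadpoles <- data.frame(
  Measurer = rep(c("Susan", "Darren"), each = 4),
  Block    = rep(c(1, 2), each = 4),
  Tank     = rep(1:4, each = 2),
  Tadpole  = 1:8,                                   # unique ID per individual
  SVL      = c(10.2, 11.1, 9.8, 10.5, 12.0, 11.4, 10.9, 11.7),
  Tail     = c(15.3, 16.0, NA, NA, 17.2, NA, 16.1, 16.8)  # NA marks missing data
)
tadpoles

# Missing values propagate unless you ask R to skip them
mean(tadpoles$Tail)
mean(tadpoles$Tail, na.rm = TRUE)
```

Because missing tails are coded as NA rather than left blank or written as "none", the Tail column stays numeric and can be summarized directly.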

统计代写|应用统计代写applied statistics代考|UNDERSTANDING VARIOUS TYPES OF OBJECTS IN R

There are a number of different types of objects in R, and it is important to understand how each of these works (Table 1.1). There are certainly more types of objects than these, but these are the foundational objects you need to understand for now. For each of these types of objects (and most anything in R), we can ask R what kind of object it is by using the str() function (short for “structure”).

1.9.1 Vector

A vector has only a single dimension: it is a sequence of elements that are all the same type. The length of the vector is defined by the number of elements in the vector. All of these must be the same mode (hopefully you remember what the mode is from just a few pages ago!).

str(r1)

num [1:100] 6.87 4.38 4.38 6.58 8.62 ...

The structure of object r1 tells us it is a numeric vector with 100 elements, and it gives us the first five elements. We can also ask R directly if r1 is a vector.

is.vector(r1)
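To try this yourself, you need an object called r1. How r1 was created is not shown here, so the sketch below simply assumes 100 random draws from a normal distribution; your values will differ from the output shown above.

```r
# Assumption: r1 is 100 random numbers (mean and sd chosen arbitrarily)
r1 <- rnorm(100, mean = 5, sd = 2)

str(r1)         # structure: numeric vector with 100 elements
is.vector(r1)   # ask directly whether it is a vector
length(r1)      # number of elements
mode(r1)        # every element shares this mode

# Mixing types forces everything into a single mode
mixed <- c(1, 2, "three")
mode(mixed)
```

The last two lines show why a single stray word in a column of numbers turns the whole vector into character data.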

统计代写|应用统计代写applied statistics代考|Matrix

A matrix is essentially a vector that has been given an additional attribute, which is just where to wrap around to create multiple rows or columns. Thus, it’s a vector that has a 2-dimensional structure. Since it is basically just a fancy vector, all the elements in a matrix still need to be of the same mode (e.g. “numeric,” “logical,” etc.).

To create a matrix, we can specify the data to start off with, plus the number of rows and columns and whether the data should be wrapped by rows or by columns (with the “byrow=” argument). Note that there is no “bycol=” argument. Wrapping by column is the default, so saying you don’t want to wrap by row (“byrow=FALSE”) is how you ask for the data to fill the matrix column by column.
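A short sketch of the two ways of wrapping, using the numbers 1 to 6 as stand-in data:

```r
# Fill row by row: the first row is 1, 2, 3
m_by_row <- matrix(1:6, nrow = 2, ncol = 3, byrow = TRUE)
m_by_row

# Fill column by column (the default): the first column is 1, 2
m_by_col <- matrix(1:6, nrow = 2, ncol = 3, byrow = FALSE)
m_by_col

# Leaving byrow out gives the same result as byrow = FALSE
identical(m_by_col, matrix(1:6, nrow = 2, ncol = 3))

# A matrix is still a vector underneath, with dimensions attached
dim(m_by_row)
mode(m_by_row)
```

The same six numbers end up in different positions depending only on the wrapping direction.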

A data frame is probably the most useful and most used of the objects we will discuss in this book. I know I said earlier that vectors are the most important, and they are, but a data frame is essentially a table composed of one or more vectors. Thus, if you can understand vectors you can understand data frames. All of the vectors in a data frame have to have the same length (which is important), but the data in those vectors can be different modes (which is also important). We will learn how to read in data in Chapter 3. For now, let’s create a data frame from scratch, which also provides an opportunity to introduce some useful basic functions.
Let’s make a vector of values and a vector of names (you might imagine they are treatment groups, for example). We can use the function rep() to repeat something as many times as we want. For example, let’s imagine we have two treatments each with 20 individuals, and that the average value of whatever we have measured is 5 for one group and 10 for the other group. In the code below we have nested several functions to achieve what we want to do. In the first line, we use the c() function to concatenate the words Group.A and Group.B into a single vector, then use the rep() function to repeat each value in that vector 20 times. What happens if you replace the argument “each=” with “times=”? In the second line, we concatenate together two vectors that are each 20 numbers long. In the third line we use the function data.frame() to create the new data frame from our two vectors and store it as an object called df1.
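The code itself did not survive in this copy of the text, so what follows is a reconstruction from the description above rather than the original listing. In particular, using rnorm() with the default standard deviation to generate the 20 values per group is an assumption.

```r
# Line 1: repeat each group name 20 times
treatment <- rep(c("Group.A", "Group.B"), each = 20)

# Line 2: 20 values averaging about 5, then 20 averaging about 10
# (rnorm() with its default sd = 1 is an assumption)
values <- c(rnorm(20, mean = 5), rnorm(20, mean = 10))

# Line 3: combine the two vectors into a data frame
df1 <- data.frame(treatment, values)
head(df1)

# With times= instead of each=, the names alternate A, B, A, B, ...
alternating <- rep(c("Group.A", "Group.B"), times = 20)
head(alternating)
```

With each=, all 20 Group.A labels come first and line up with the first 20 values; with times=, the labels would no longer match the values.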

As discussed previously, there is nothing special about the names chosen here. We could have called our vectors whatever we wanted, but the names treatment and values are sensible names to use for our purposes. Note that making the data frame df1 from the two vectors only works because we had already created the objects treatment and values. If you called your vectors “vec1” and “vec2,” you would need to modify the code where you create the data frame accordingly.

Earlier we used the function str() to look at the structure of a vector. The same function works for getting a quick look at a data frame. I think you will find that str() is one of the most useful functions there is. It quickly tells you what sort of data are found in each column in your data frame, as well as the size of the data frame. The number of “obs.” is the number of rows in the data frame and the number of “variables” is the number of columns.
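A minimal sketch of str() on a data frame follows. The data frame is rebuilt here with random stand-in values so the example is self-contained.

```r
# A 40-row, 2-column data frame with stand-in values
df1 <- data.frame(
  treatment = rep(c("Group.A", "Group.B"), each = 20),
  values    = rnorm(40)
)

# Reports the number of obs. (rows) and variables (columns),
# plus the type and first few entries of every column
str(df1)

# The same two numbers, obtained directly
nrow(df1)
ncol(df1)
dim(df1)
```

Running str() right after creating or importing a data frame is a cheap way to catch columns that came in as the wrong type.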

统计代写|应用统计代写applied statistics代考|Introduction to R

The purpose of this first chapter is to introduce you to the basic workings of R and get you up to speed. Some of this material might be familiar to you if you've used R before, but the goal is to get anyone reading the book up to a basic level of familiarity. You will learn many of the basic and very important functions of R, such as:

• Creating objects
• Writing articulate R code
• Using functions
• Generating artificial data
• Entering data in a format that can be read and analyzed by R

This chapter does not intend to be an exhaustive introduction to all the basic workings of R. In other words, we'll move pretty quickly here. If you would like a fuller introduction, I highly recommend checking out the excellent book Getting Started With R: An Introduction for Biologists by Andrew Beckerman, Dylan Childs, and Owen Petchey.

R is designed to be a small program (currently just about 80 MB), which makes it easy to download and install anywhere in the world. The base version of R contains a great number of functions for organizing and analyzing data, but the real strength comes in what are called packages. Packages are freely downloadable additions to R that provide new functions and datasets for particular analyses. For example, the base version of R can conduct linear models and generalized linear models (Chapters 5-7) but cannot conduct mixed effects models (Chapter 8). To do mixed effects models, you need to download a specific package (of which there are several).

The only important thing to remember about packages is that adding them to R is a two-step process. First, you have to install a package, which (perhaps counterintuitively) just downloads the package to your computer. Second, you have to load the package, which places it in the current memory for use. You will generally obtain packages from the Comprehensive R Archive Network (CRAN; https://cran.r-project.org/) directly through R.
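The two steps can be sketched as follows. The package name lme4 is only an illustrative choice of a mixed effects package, and load_or_install() is a hypothetical helper written for this example, not a function that ships with R.

```r
# Step 1 (once per computer): install, i.e. download the package from CRAN.
# Commented out so that running this example does not download anything.
# install.packages("lme4", dependencies = TRUE)

# Step 2 (every new R session): load the installed package into memory.
# library(lme4)

# A common pattern: install only if the package is missing, then load it.
load_or_install <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg, dependencies = TRUE)
  }
  library(pkg, character.only = TRUE)
}

# "stats" ships with R, so this call loads it without downloading anything.
load_or_install("stats")
```

Installing is needed only once per computer (or after updating R), whereas library() has to be called in every new session, which is why it usually sits at the top of a script.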

统计代写|应用统计代写applied statistics代考|WORKING FROM THE SCRIPT WINDOW

The biggest mistake that most new R users make is to just type commands into the command prompt. The problem with this is that once you hit enter, the command is gone. If you hit the up-arrow, R will scroll through the previously executed commands, but aside from this, what you typed is gone and cannot be edited! It is of course reasonable to run lines from the command line from time to time, but it is much better to work from a script window.

The script window allows you to easily save and edit your code, and to execute one or multiple lines of code at once. To open a blank script window, go to the File menu and click on New Document, or just hit command-N (Mac) or control-N (PC) on your keyboard.

In the script window you can type in your commands and then execute them by hitting command-enter (Mac) or control-R (PC). This means you type code into the script window and then the program sends the line of code to the command prompt for you. Do not cut and paste code from the script window to the command prompt; that is a waste of time. You can also highlight multiple lines of code and execute them all at once. To save your code, simply go to the File menu and save as you would any other file (or just hit command-S or control-S on your keyboard).

A script allows you to edit, run, and tweak your code, save it, return to it later, send it to collaborators or mentors, and so on. Anything you think you will want to run more than once, or that you might want to edit, should be typed into a script window (which is pretty much everything).
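To make the idea of a saved script concrete, here is a small sketch. It writes a three-line script to a temporary file and then runs the whole file with source(). The file contents and object names are hypothetical, and source() is shown here as an additional way to rerun a saved script, alongside the keyboard shortcuts described above.

```r
# Write a tiny script to a temporary file. Normally you would type these
# lines in the script window and save the file from the File menu.
script_path <- tempfile(fileext = ".R")
writeLines(c(
  "# Hypothetical example script",
  "x <- c(2, 4, 6)    # create an object",
  "x_mean <- mean(x)  # use a function on it"
), script_path)

# Run every line of the saved script in one go.
source(script_path)

# Objects created by the script are now available in the session.
x_mean
```

Because the commands live in a file rather than at the prompt, they can be edited and rerun as often as needed.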
