
Computer Science | Machine Learning | COMP3308

Posted on August 23, 2023 / October 10, 2023 by statistics-lab

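Machine learning is exciting. It is fun, challenging, creative, and intellectually stimulating. It also makes money for companies, autonomously handles large volumes of tasks, and takes the burden of monotonous work off people who would rather be doing something else.

Machine learning is also extremely complex. With thousands of algorithms, hundreds of open-source packages, and practitioners expected to have skills ranging from data engineering (DE) to advanced statistical analysis and visualization, the work required of a professional ML practitioner is truly daunting. Adding to that complexity is the need to work cross-functionally with a wide range of specialists, subject-matter experts (SMEs), and business unit groups, communicating and collaborating on the nature of the problem being solved and on the output of the ML-backed solution.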

Bias and Variance Analysis

Bias and variance analysis plays an important role in ensemble classification research due to the framework it provides for classifier prediction error decomposition.

Geman et al. first decomposed prediction error into bias and variance terms in the regression setting under squared-error loss [10]. The decomposition shows that a decrease or increase in the prediction error rate is caused by a decrease or increase in bias, in variance, or in both. The analysis was later extended to the classification setting and applied to different ensemble classifiers in order to explain their success over single classifiers. These studies show that most ensembles achieve lower prediction error rates because they reduce both bias and variance.
However, different researchers have extended the original regression analysis to classification in different ways, and no standard definition has been accepted. The results of the analyses therefore differ slightly from one another. Definitions and frameworks that have gained interest within the research field include those of Breiman [3], Kohavi and Wolpert [15], Dietterich and Kong [16], Friedman [9], Wolpert [25], Heskes [11], Tibshirani [19], Domingos [6] and James [13].

Although the frameworks differ, the main intuitions behind them are similar. Consider a training set $T$ with patterns $\left(x_i, l_i\right)$, where $x$ represents the feature vector and $l$ the corresponding label. Given a test pattern, an optimal classifier model predicts the decision label that has the lowest expected loss over all possible target label values. This classifier, which is the Bayes classifier when used with the zero-one loss function, is assumed to know and use the underlying probability distribution of the input patterns and classes. If we call the decision of the optimal classifier the optimal decision $(OD)$, then for a given test pattern $\left(x_i, l_i\right)$, $OD=\operatorname{argmin}_\alpha E_t[L(t, \alpha)]$, where $L$ denotes the loss function used and $t$ the possible target label values.

The estimator, on the other hand, is an averaged classifier model. It predicts the decision label that has the lowest expected loss over all labels produced by classifiers trained on different training sets. The intrinsic parameters of these classifiers are usually the same; the only difference is the training set each one is trained on. Instead of minimizing the loss over the target labels using the known underlying probability distribution, as the optimal classifier does, the estimator minimizes the loss over the set of labels produced by the classifiers trained on the various training sets. If the decision of the estimator is called the expected estimator decision $(EED)$, then for a given test pattern $\left(x_i, l_i\right)$, $EED=\operatorname{argmin}_\alpha E_l[L(l, \alpha)]$, where $L$ denotes the loss function used and $l$ the label values obtained from the classifiers used. For regression under squared-error loss, the $OD$ is the mean of the target labels, while the $EED$ is the mean of the classifier decisions obtained from the different training sets.
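The two decisions defined above can be illustrated with a small sketch. It assumes we can sample target labels for one test pattern and collect the votes of classifiers trained on different training sets; the function names and the sample data are hypothetical and not taken from the source. Under zero-one loss the minimizer is the most frequent label, and under squared-error loss it is the mean.

```python
from collections import Counter
from statistics import mean


def zero_one_minimizer(labels):
    """Return the label with the lowest expected zero-one loss: the mode."""
    return Counter(labels).most_common(1)[0][0]


def squared_error_minimizer(values):
    """Return the value with the lowest expected squared error: the mean."""
    return mean(values)


# Hypothetical draws for a single test pattern x (illustration only).
target_samples = ["a", "a", "b", "a"]          # samples of the true label t given x
classifier_votes = ["b", "b", "a", "b", "b"]   # labels from classifiers trained on different sets

od = zero_one_minimizer(target_samples)        # optimal decision (OD)
eed = zero_one_minimizer(classifier_votes)     # expected estimator decision (EED)
print("OD:", od, "EED:", eed)
```

In this toy case the two decisions disagree, which is the situation the bias term is meant to capture.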

Bias and Variance Analysis of James

James [13] extends the prediction error decomposition, originally proposed by Geman et al. [10] for squared error in the regression setting, to all symmetric loss functions. His definition therefore also covers zero-one loss in the classification setting, which we use in the experiments.

In his decomposition, the terms systematic effect (SE) and variance effect (VE) satisfy the additive decomposition for all symmetric loss functions, and for both real valued and categorical predictors. They actually indicate the effect of bias and variance on the prediction error. For example, a negative $V E$ would mean that variance actually helps reduce the prediction error. On the other hand, the bias term is defined to show the average distance between the response and the predictor; and the variance term refers to the variability of the predictor. As a result, both the meanings and the additive characteristics of the bias and variance concepts of the original setup have been preserved. Following is a summary of the bias-variance derivations of James:

For any symmetric loss function $L$, where $L(a, b)=L(b, a)$:
$$
\begin{aligned}
E_{Y, \tilde{Y}}[L(Y, \tilde{Y})]= & E_Y[L(Y, S Y)]+E_Y[L(Y, S \tilde{Y})-L(Y, S Y)] \\
& +E_{Y, \tilde{Y}}[L(Y, \tilde{Y})-L(Y, S \tilde{Y})] \\
\text{prediction error}= & \operatorname{Var}(Y)+\operatorname{SE}(\tilde{Y}, Y)+\operatorname{VE}(\tilde{Y}, Y)
\end{aligned}
$$
where $L(a, b)$ is the loss when $b$ is used in predicting $a$, $Y$ is the response and $\tilde{Y}$ is the predictor, $S Y=\operatorname{argmin}_\mu E_Y[L(Y, \mu)]$ and $S \tilde{Y}=\operatorname{argmin}_\mu E_{\tilde{Y}}[L(\tilde{Y}, \mu)]$. Prediction error is thus composed of the variance of the response (irreducible noise), $SE$ and $VE$.

Using the same terminology, the bias and variance for the predictor are defined as follows:
$$
\begin{aligned}
\operatorname{Bias}(\tilde{Y}) & =L(S Y, S \tilde{Y}) \\
\operatorname{Var}(\tilde{Y}) & =E_{\tilde{Y}}[L(\tilde{Y}, S \tilde{Y})]
\end{aligned}
$$
When the specific case of classification problems with zero-one loss function is considered, we end up with the following formulations:
$L(a, b)=I(a \neq b)$, $Y \in\{1,2,3, \ldots, N\}$ for an $N$-class problem, $P_i^Y=P_Y(Y=i)$, $P_i^{\tilde{Y}}=P_{\tilde{Y}}(\tilde{Y}=i)$, and $S Y=\operatorname{argmin}_i E_Y[I(Y \neq i)]=\operatorname{argmax}_i P_i^Y$. Therefore,
$$
\begin{aligned}
\operatorname{Var}(Y) & =P_Y(Y \neq S Y)=1-\max _i P_i^Y \\
\operatorname{Var}(\tilde{Y}) & =P_{\tilde{Y}}(\tilde{Y} \neq S \tilde{Y})=1-\max _i P_i^{\tilde{Y}} \\
\operatorname{Bias}(\tilde{Y}) & =I(S \tilde{Y} \neq S Y) \\
\operatorname{VE}(\tilde{Y}, Y) & =P(Y \neq \tilde{Y})-P_Y(Y \neq S \tilde{Y})=P_{S \tilde{Y}}^Y-\sum_i P_i^Y P_i^{\tilde{Y}} \\
\operatorname{SE}(\tilde{Y}, Y) & =P_Y(Y \neq S \tilde{Y})-P_Y(Y \neq S Y)=P_{S Y}^Y-P_{S \tilde{Y}}^Y
\end{aligned}
$$
where $I(q)$ is 1 if $q$ is a true argument and 0 otherwise.
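The zero-one formulas above can be checked numerically with a minimal sketch. It assumes the class distributions of the response and of the predictor at a single test point are given as lists, and that $Y$ and $\tilde{Y}$ are independent (which is what the product $\sum_i P_i^Y P_i^{\tilde{Y}}$ implies). The function name `james_zero_one` and the example probabilities are hypothetical.

```python
def james_zero_one(p_y, p_pred):
    """James's decomposition under zero-one loss at one test point.

    p_y[i]    = P(Y = i), distribution of the response
    p_pred[i] = P(Y~ = i), distribution of the predictor over training sets
    Assumes Y and Y~ are independent.
    """
    sy = max(range(len(p_y)), key=lambda i: p_y[i])            # SY: most likely response
    s_pred = max(range(len(p_pred)), key=lambda i: p_pred[i])  # SY~: most frequent prediction
    agree = sum(a * b for a, b in zip(p_y, p_pred))            # P(Y = Y~)
    return {
        "var_y": 1.0 - p_y[sy],            # Var(Y), irreducible noise
        "var_pred": 1.0 - p_pred[s_pred],  # Var(Y~)
        "bias": int(s_pred != sy),         # Bias(Y~)
        "ve": p_y[s_pred] - agree,         # variance effect
        "se": p_y[sy] - p_y[s_pred],       # systematic effect
        "error": 1.0 - agree,              # P(Y != Y~)
    }


# Hypothetical three-class example in which the predictor is biased.
result = james_zero_one([0.7, 0.2, 0.1], [0.4, 0.5, 0.1])
print(result)
```

With these numbers the predictor's most frequent label is wrong (bias 1), and the variance effect is negative: the predictor's variability moves some predictions onto the correct class, which is the case mentioned earlier where variance helps reduce the prediction error.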



Computer Science | Machine Learning | QBUS6850

Posted on August 23, 2023 by statistics-lab

Error Correcting Output Coding (ECOC)

ECOC is an ensemble technique [4], in which multiple base classifiers are created and trained according to the information obtained from a pre-set binary code matrix. The main idea behind this procedure is to solve the original multi-class problem by combining the decision boundaries obtained from simpler two-class decompositions. The original problem is likely to be more complex compared to the subproblems into which it is decomposed, and therefore the aim is to come up with an easier and/or more accurate solution using the sub-problems rather than trying to solve it by a single complex classifier.

The base classifiers are two-class classifiers (dichotomizers), each of which is trained to solve a different bi-partitioning of the original problem. The bi-partitions are created by combining the patterns from some predetermined classes and relabeling them. An example bi-partitioning of a dataset with $N>2$ classes would label the patterns from the first 2 classes as $+1$ and those from the last $N-2$ classes as $-1$. The training patterns are therefore separated into two superclasses for each base classifier, and the information about how to create these superclasses is obtained from the ECOC matrix.

Consider an ECOC matrix $C$, where each element $C_{i j}$ belongs to the set $\{+1,-1\}$. Each $C_{i j}$ indicates the desired label for class $i$, to be used in training base classifier $j$; each row, called a codeword, represents the desired output of the whole set of base classifiers for the class it indicates. Figure 4.1 shows an ECOC matrix for a 4-class problem for illustration purposes.
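The role of the code matrix can be sketched in a few lines. The matrix below is a hypothetical 4-class example and is not the one in Figure 4.1. Decoding by minimum Hamming distance is a common choice that is assumed here; the text above only describes how the matrix drives training.

```python
# Hypothetical code matrix: rows are classes (codewords), columns are base classifiers.
C = [
    [+1, +1, +1, -1, -1],
    [+1, -1, -1, +1, -1],
    [-1, +1, -1, +1, +1],
    [-1, -1, +1, -1, +1],
]


def relabel(labels, column):
    """Map original class labels to the +1/-1 superclass labels of one base classifier."""
    return [C[y][column] for y in labels]


def decode(outputs):
    """Return the class whose codeword is closest to the base classifier outputs."""
    def hamming(codeword):
        return sum(c != o for c, o in zip(codeword, outputs))
    return min(range(len(C)), key=lambda i: hamming(C[i]))


# Training labels seen by base classifier 0 for patterns of classes 0, 1, 2, 3, 0.
print(relabel([0, 1, 2, 3, 0], column=0))

# Outputs equal to the codeword of class 1 with the last bit flipped.
print(decode([+1, -1, -1, +1, +1]))
```

Any two rows of this particular matrix differ in at least three positions, so a single wrong base classifier output still decodes to the intended class.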




Computer Science | Machine Learning | COMP4702

Posted on August 23, 2023 by statistics-lab

UCI Data Experiments

The purpose of the experiments in this section is to compare the classification accuracy of MBDS, VMBDSs, eECOC, and OA ensembles on 15 UCI datasets [2]. Three types of classifiers were employed as binary base classifiers: the Ripper rule classifier [3], logistic regression [10], and Support Vector Machines [8]. The number of MBDS classifiers in the VMBDSs ensembles varied from 1 to 15. The evaluation method was 10-fold cross-validation averaged over 10 runs. The results are given in Table 3.4, Table 3.5, and Table 3.6 in the Appendix. The classification accuracy of the classifiers was compared using the corrected paired t-test [15] at the 5% significance level. Two types of t-test comparisons were carried out: eECOC ensembles against all other ensembles, and OA ensembles against all other ensembles.
The results in Tables 3.4-3.6 show that:

The difference in classification accuracy between the MBDS and eECOC ensembles is not statistically significant in 28 out of 45 experiments. In the remaining 17 experiments the classification accuracy of the MBDS ensembles is statistically lower than that of the eECOC ensembles.

The difference in classification accuracy between the MBDS and OA ensembles is not statistically significant in 34 out of 45 experiments. In the remaining 11 experiments the classification accuracy of the MBDS ensembles is statistically lower than that of the OA ensembles.

The classification accuracy of the VMBDSs ensembles varies between the accuracy of the MBDS ensembles and the accuracy of the eECOC ensembles. The difference of the classification accuracy of the worst VMBDSs ensembles and the eECOC ensembles is not statistically significant in 28 out of 45 experiments. In the remaining 17 experiments the classification accuracy of the worst VMBDSs ensembles is statistically lower than that of the eECOC ensembles. The difference of the classification accuracy of the best VMBDSs ensembles and the eECOC ensembles is not statistically significant in 44 out of 45 experiments. In the remaining one experiment the classification accuracy of the best VMBDSs ensembles is statistically greater than that of the eECOC ensembles.

The difference in classification accuracy between the worst VMBDSs ensembles and the OA ensembles is not statistically significant in 34 out of 45 experiments. In the remaining 11 experiments the classification accuracy of the worst VMBDSs ensembles is statistically lower than that of the OA ensembles. The difference in classification accuracy between the best VMBDSs ensembles and the OA ensembles is not statistically significant in 38 out of 45 experiments. In the remaining 6 (1) experiments the classification accuracy of the best VMBDSs ensembles is statistically greater (lower) than that of the OA ensembles. In addition, we compare the VMBDSs and OA ensembles when they have an approximately equal number of binary classifiers. In this case we compare the VMBDSs ensembles that use two MBDS classifiers with the OA ensembles. The difference in classification accuracy between these VMBDSs ensembles and the OA ensembles is not statistically significant in 41 out of 45 experiments. In the remaining 2 (2) experiments the classification accuracy of the VMBDSs ensembles is statistically greater (lower) than that of the OA ensembles.
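The comparisons above rely on a corrected paired t-test, whose formula is not given in this excerpt. The sketch below assumes that [15] refers to the widely used variance-corrected resampled t-test, in which the usual $1/k$ factor is replaced by $1/k + n_{\text{test}}/n_{\text{train}}$ to account for overlapping training sets. The function name and the accuracy differences are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def corrected_paired_t(diffs, test_train_ratio):
    """Corrected resampled t statistic for paired accuracy differences.

    diffs            : per-fold accuracy differences between two classifiers
    test_train_ratio : n_test / n_train, e.g. 1 / (k - 1) for k-fold cross-validation
    Returns the t statistic and its degrees of freedom.
    """
    k = len(diffs)
    # Assumed correction: inflate the variance term because folds share training data.
    scale = 1.0 / k + test_train_ratio
    t = mean(diffs) / sqrt(scale * variance(diffs))
    return t, k - 1


# Hypothetical differences from one run of 5-fold cross-validation.
diffs = [0.02, 0.01, 0.03, 0.00, 0.04]
t_stat, dof = corrected_paired_t(diffs, test_train_ratio=1 / 4)
print(t_stat, dof)
```

The resulting statistic would then be compared with the critical value of a t distribution with `dof` degrees of freedom at the chosen significance level. The correction always makes the statistic smaller in magnitude than the uncorrected paired t statistic, so fewer differences are declared significant.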

Experiments on Data Sets with Large Number of Classes

The purpose of this section’s experiments is to compare the classification accuracy of the VMBDSs and OA on three datasets with a large number of classes. The datasets chosen are Abalone [2], Patents [12], and Faces94 [11]. Several properties of these datasets are summarized in Table 3.1.

The eECOC ensembles were excluded from the experiments, since they require an exponential number of binary classifiers (in our experiments at least $2^{27}-1$). Support Vector Machines [8] were used as the base classifier. The number of MBDS classifiers in the VMBDSs ensembles was varied from 5 to 25. The evaluation method was 5-fold cross-validation averaged over 5 runs. The results are presented in Table 3.2. The classification accuracy of the classifiers is compared using the corrected paired t-test [15] at the 5% significance level. The test compares the OA ensembles against all VMBDSs ensembles.

The experimental results in Table 3.2 show that the VMBDSs ensembles can statistically outperform the OA ensembles on these three datasets. In this respect it is important to know whether the VMBDSs ensembles outperform the OA ensembles when both types of ensembles contain the same number of binary classifiers, i.e., when their computational complexities are equal. We show how to organize this experiment for the Abalone dataset. This dataset has 28 classes, so the OA ensemble contains 28 binary classifiers. We therefore have to find a configuration for the VMBDSs ensembles in which the total number of binary classifiers is close to 28. The number of binary classifiers in each MBDS ensemble is $\left\lceil\log _2(28)\right\rceil=5$, so to have close to 28 binary classifiers we need $\left\lfloor\frac{28}{5}\right\rfloor=5$ MBDS classifiers. According to Table 3.2, the VMBDSs ensemble statistically outperforms the OA ensemble for this configuration. The same computation can be done for the Patents and Faces94 datasets: for the Patents dataset we need 10 MBDS classifiers, and for the Faces94 dataset we need 19 MBDS classifiers in the VMBDSs ensemble. According to Table 3.2, the VMBDSs ensembles statistically outperform the OA ensemble for these configurations as well.
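The budget calculation for the Abalone dataset can be reproduced with a short sketch. The function name is hypothetical, and the eECOC count $2^{K-1}-1$ is an assumption inferred from the $2^{27}-1$ figure quoted above for 28 classes. Class counts for Patents and Faces94 are not given in this excerpt, so only Abalone is shown.

```python
from math import ceil, log2


def ensemble_sizes(num_classes):
    """Binary classifier counts for the ensembles compared in the text."""
    per_mbds = ceil(log2(num_classes))   # binary classifiers inside one MBDS classifier
    oa = num_classes                     # one-against-all: one binary classifier per class
    num_mbds = oa // per_mbds            # MBDS classifiers giving a budget close to OA
    eecoc = 2 ** (num_classes - 1) - 1   # assumed size of the exhaustive ECOC ensemble
    return {
        "per_mbds": per_mbds,
        "oa": oa,
        "num_mbds": num_mbds,
        "vmbds_total": num_mbds * per_mbds,
        "eecoc": eecoc,
    }


# Abalone has 28 classes according to the text.
print(ensemble_sizes(28))
```

For 28 classes this gives 5 MBDS classifiers with 25 binary classifiers in total, against 28 for OA, which is why the two ensembles are treated as having comparable cost.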



Computer Science | Deep Learning | T81-558

Posted on August 22, 2023 / August 28, 2023 by statistics-lab

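Deep learning is a form of machine learning and an approach to artificial intelligence. By building complex representations of the world out of simpler ones, deep learning techniques have achieved state-of-the-art results in computer vision, speech recognition, natural language processing, clinical applications, and other areas, with the promise (and the potential) to transform society.

Deep learning architectures such as deep neural networks, deep belief networks, deep reinforcement learning, recurrent neural networks, convolutional neural networks, and Transformers have been applied in fields including computer vision, speech recognition, natural language processing, machine translation, bioinformatics, drug design, medical image analysis, climate science, material inspection, and board game programs, where they have produced results comparable to, and in some cases surpassing, the performance of human experts.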

Knowledge Retrieval

A person’s knowledge store is vast. ${ }^{26}$ At any one moment, only a small number of all the knowledge elements in long-term memory are active. When a person is faced with a problem, particularly an unfamiliar one, every piece of his prior knowledge is potentially relevant. To be applied to the situation at hand, a knowledge element has to be retrieved through spread of activation. ${ }^{27}$ To visualize this process, it is useful to conceptualize long-term memory as a network in which knowledge elements are nodes and the relations between them are links. Each node is associated with a level of activation that fluctuates over time. At each moment in time, a small subset of the nodes have activation levels above a threshold. Those elements are immediately available for processing; they form the current content of working memory. Activation is passed along the links from elements that are currently above threshold to other, related but not yet active nodes. If a knowledge element receives enough activation to rise above threshold, it “comes to mind” as we say. “Retrieval” is a label for the event that occurs when the activation of a knowledge element rises above threshold. As activation spreads from a source node $N$, it is passed along its outbound links. A certain amount of activation is lost in each step of the spreading process, so the amount that is spread from a given source node $N$ decreases gradually with increased distance from $N$. There are several variants of this theory that differ in the quantitative details of the spreading process, but those details need not concern us here.

Memory retrieval is selective. A person can keep only a small amount of information in an active state at any one time – working memory has a limited capacity – but the knowledge store is vast, so the retrieval process necessarily makes choices, however implicit and unconscious, about what to retrieve. Retrieving an element $X$ constrains which other elements can also be retrieved at the same time. In a problem situation, activation initially spreads from the problem representation and the goal. The initial encounter with the problem thus determines the knowledge elements that are initially marshaled to solve it, and those elements in turn become sources from which activation spreads. Retrieval is a cyclic, iterative search through memory for those knowledge elements that are most likely to be relevant for the problem at hand.
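The spreading process described above can be sketched as a small graph computation. This is only an illustration under stated assumptions: the text notes that variants of the theory differ in their quantitative details, so the fixed decay factor, the threshold, the rule that a node keeps the largest activation it receives, and the example network are all hypothetical.

```python
def spread_activation(links, sources, decay=0.5, threshold=0.2):
    """Return the set of nodes whose activation ends up at or above the threshold.

    links   : dict mapping each node to the nodes on its outbound links
    sources : dict mapping initially active nodes to their activation levels
    decay   : fraction of activation that survives each step (assumed constant)
    """
    activation = dict(sources)
    frontier = list(sources)
    while frontier:
        node = frontier.pop()
        passed = activation[node] * decay       # some activation is lost at every step
        if passed < threshold:
            continue                            # too weak to retrieve anything further
        for neighbour in links.get(node, []):
            if passed > activation.get(neighbour, 0.0):
                activation[neighbour] = passed  # neighbour rises above threshold
                frontier.append(neighbour)      # and becomes a new source
    return {n for n, a in activation.items() if a >= threshold}


# Hypothetical fragment of long-term memory probed by the concept "chair".
links = {
    "chair": ["sit", "table"],
    "sit": ["rest"],
    "table": ["dinner"],
    "rest": ["sleep"],
}
print(sorted(spread_activation(links, {"chair": 1.0})))
```

In this example "sleep" is three links away from the source, receives too little activation, and is therefore not retrieved, which mirrors the gradual loss of activation with distance.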

Anticipation

When we perceive an object, we register certain actions that we can perform vis-à-vis that object and certain functions that the object can perform for us. Seeing a chair, the thought of sitting down is not far away; seeing a soccer ball on a lawn, the thought of kicking it not only comes to mind but is hard to resist. Children need no instruction to figure out that pebbles on a beach can be thrown into the sea. In general, the mere perception of an object is sufficient to activate certain actions and dispositions vis-à-vis that object. The opportunities for action associated with an object are called the affordances of that object. ${ }^{30}$ We can think about overt actions without performing them, so we must possess mental representations of them.

Goals likewise suggest to us actions that accomplish them. The need to fasten two things to each other prompts us to think about gluing, nailing, taping and tying. The need to reach something on a high shelf makes us look for a box that is sturdy enough to stand on, a chair, footstool, ladder or some other means of increasing our height. Failing to find one, we might look for a broom handle or some other way to increase our reach. In short, both the current state of affairs and the goal can serve as memory probes that retrieve actions or, more precisely, mental representations of actions.

In their 1972 treatise on analytical thinking, Human Problem Solving, Herbert A. Simon and Allen Newell emphasized that thinking is anticipatory. ${ }^{31}$ The representation of a familiar action contains knowledge that enables a person to execute it but also to anticipate its effects. To think about a problem is to imagine the outcomes of the possible actions before they are carried out: If I do this or that, the result, the new state of affairs, will be such-and-such. This mental look-ahead process allows us to evaluate the promise and usefulness of actions ahead of time. For example, in playing a board game like chess, a player will imagine making a move, anticipate how the relations on the board would change if he were to make that move, and use that anticipated outcome to evaluate the move before deciding what to do. Another commonplace example is to think through the effects of moving the sofa in one’s living room to the other side of the room before taking the trouble of moving it physically (if we put the sofa there, there is no place for the end table). Look-ahead is quicker, requires less effort, allows us to explore mutually exclusive options (buy house $X$ vs. buy house $Y$ ) and saves us, when consequences are costly, from having to pay the price of our poor judgment. To think analytically is, in part, to carry out actions in the mind before we carry them out in the flesh.


Deep Learning | CSCI5922

Posted on August 22, 2023 (August 28, 2023) by statistics-lab

FRAMING THE PROBLEM OF INSIGHT

To study insight, researchers need a technique that allows them to reliably produce such events. Historical and biographical studies of the Edisons, the Einsteins and the Michelangelos of human history are limited by the fact that there are only a few of them and their appearance is unrelated to the investigator’s needs. We cannot whistle up a creative genius whenever a hypothesis about creativity is ready to be tested. In addition, the historical records of great achievements seldom allow us to follow the thought processes at a fine enough temporal grain. If a scientist writes in his laboratory notebook three times a week, and a new idea forms in the course of a few minutes, then that notebook is too blunt an instrument with which to grab hold of the creative act. It is possible but impractical for an observer to follow the activities in, for example, a chemical laboratory for a year or a decade, in the hope that a Lavoisier or a Pauling will emerge in that place during that time. ${ }^3$ Another problem is that field studies of creative projects do not allow the psychologist to study creativity under systematically varied conditions, which is a powerful investigatory technique.

The alternative is to conduct experiments. I use the word broadly to refer to any study in which the researcher deliberately arranges for certain events to take place for the purpose of observing them. To conduct an experiment on creativity is to ask a person to solve some problem that requires a novel response under conditions that allow his behavior to be recorded for later analysis. The key step in conducting such an experiment is to choose a suitable problem.

The Case Against Insight Problems

Insight problems tend to share the following features: (a) They can be stated concisely. The given materials encompass only a few simple objects, if any at all. The task instructions and the goal can be summarized in a handful of sentences. (b) The solutions, once thought of, take only a few seconds to complete, and they consist of no more than a handful of actions or inferences. They require no exotic skills but only actions that any normal adult is familiar with, such as drawing a line on a piece of paper, cutting something with a pair of scissors, tying a knot, multiplying a number by a small integer and so on. Some insight problems require no physical actions at all, merely an inference or two. The difficulty lies solely in thinking of the right action or inference. (c) The problems are difficult in the sense that the time adults of average intelligence need to think of the solution is out of proportion to the time it takes to execute the solution, once thought of.

For example, the Two-String Problem poses the task of tying together two ropes hanging from the ceiling at such a distance that a person cannot reach the second rope while holding on to the first. ${ }^4$ The solution is to tie some heavy object to the second rope and set it swinging, walk back to the first rope and grab it, and grab the second rope at the end of its swing. The Hat-Rack Problem requires the problem solver to build a hat rack out of three 2-by-4 wooden boards and a C-clamp. ${ }^5$ The solution is to wedge two boards between floor and ceiling, keep them in place with the clamp and hang the hat on the clamp handle. Other problems are more geometrical than practical. For example, the Nine Dot Problem poses the task of drawing four straight lines through nine dots, arranged in a square, without lifting the pen and without retracing. ${ }^6$ Yet others are verbal riddles. The B.C. Coin Problem is an example: A man enters a rare coin shop proposing to sell what appears to be a Roman coin with an emperor’s head on one side and the text "150 B.C." on the other. The store owner immediately phones the police. Why? ${ }^7$ It usually takes a couple of minutes before a college student spots the anomaly that a coin maker could not have known himself to be living in the year 150 B.C. (Before Christ). Still others are numerical puzzles. In the Lily Pond Problem, the question is this: If the lilies on a pond cover 1.6% of the pond surface and grow so fast that they double the area they cover every 24 hours, and the pond surface is 1.2 acres, how much of the surface of the pond do they cover the day before they cover the whole pond?

Deep Learning | COMP5329

Posted on August 22, 2023 (August 28, 2023) by statistics-lab

Four Creativity Questions

A satisfactory theory of creativity should, first, explain how the production of novelty is possible in principle by proposing cognitive processes that are sufficient to generate novelty, and, second, explain what is creative about those processes by contrasting them with the processes behind noncreative, analytical solutions to problems. Third, such a theory should explain what gives direction to the creative process. Fourth, it should explain not only how it is possible to produce novelty but also why it is difficult to do so. These four questions are criteria of adequacy against which theories of the production of novelty can be evaluated.

One can produce something novel by combining entities in some way in which they have not been combined before. Neither the things combined nor the type of combination need to be novel for the result to be novel. The familiarity and ordinariness of this principle veil its depth and importance. The explanatory power of combinatorial processes resides in the disparity between the simplicity of their inputs and the magnitude of their outputs. ${ }^{23}$ Five components chosen among no more than 10 potential components and related pairwise via any one of three distinct types of connections – a rather modest generative mechanism – can produce more than 1.7 billion possible combinations. In general, combinatorial processes generate vast possibility spaces. The size of such a space increases rapidly with increases in the number of potential components and potential connections, a phenomenon known as the combinatorial explosion. The phenomenon is more familiar than the unfamiliar term suggests. The 26 letters in the alphabet suffice to spell the approximately 100,000 words in the English language, as well as the many thousands of words yet to be invented. The generativity of combination is also the secret behind the infinite diversity of the material world, with atoms playing the role of parts and molecules being the combinations. Billions of possibilities might suffice to encompass even such ill-defined but vast possibility spaces as all possible paintings and all possible machines.
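The figure of more than 1.7 billion can be reproduced under one reading of the example. That reading is an assumption here, not something the text spells out: the five components are chosen in order from ten candidates, and each of the ten pairs among the chosen five is linked by exactly one of the three connection types. The function name `count_combinations` is hypothetical.

```python
import math

def count_combinations(pool: int, chosen: int, link_types: int) -> int:
    """Count structures under the assumed reading: an ordered choice of
    `chosen` components from `pool`, with one link type per unordered pair."""
    arrangements = math.perm(pool, chosen)   # ordered selections: 10*9*8*7*6
    pairs = math.comb(chosen, 2)             # unordered pairs among the chosen
    return arrangements * link_types ** pairs

print(count_combinations(10, 5, 3))  # 1785641760
```

Raising any of the three inputs by a small amount multiplies the count many times over, which is the combinatorial explosion the paragraph describes.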

Novelty Through Accumulation

Novelty can arise through the accumulation of work. When a familiar step is executed as part of a sequence of steps, it might interact with other steps in such a way that the final outcome is novel. Chess provides an example. The individual moves in a chess game are guaranteed to be familiar, because the legal moves are specified by the rules of the game. Nevertheless, every chess match unfolds differently from any match played before it, and the particular sequence of moves played in a particular match might be called creative by chess experts. Similarly, a mathematical theorem can be novel even though every step in its proof consists of a familiar algebraic transformation. Each step along the solution path contributes some small amount to the gradually widening gap between the product under construction and prior products of the same type. Cognitive psychologist Robert W. Weisberg puts this position as follows:
On the whole, it seems reasonable to conclude that people create solutions to new problems by starting with what they know and later modify it to meet the specific problem at hand. At every step of the way, the process involves a small movement away from what is known, and even these small tentative movements into the unknown are firmly anchored in past experience. ${ }^{29}$
The accumulation-of-work principle has the advantage of concurring with two of the most salient facts about the production of novelty: Creative individuals tend to work hard and invest enormous effort into their projects. Furthermore, it is a matter of historical record that many famous novelties in the areas of art, science and technology emerged out of processes that lasted orders of magnitude longer than the time it takes to have a new idea.

Machine Learning | STAT3888

Posted on August 16, 2023 (August 28, 2023) by statistics-lab

UCI Categorization

The classification results obtained for all the UCI data sets considering the different ECOC configurations are shown in Table 2.2. In order to compare the performances provided for each strategy, the table also shows the mean rank of each ECOC design considering the twelve different experiments. The rankings are obtained estimating each particular ranking $r_i^j$ for each problem $i$ and each ECOC configuration $j$, and computing the mean ranking $R$ for each design as $R_j=\frac{1}{N} \sum_i r_i^j$, where $N$ is the total number of data sets. We also show the mean number of classifiers (#) required for each strategy.

In order to analyze if the difference between ranks (and hence, the methods) is statistically significant, we apply a statistical test. In order to reject the null hypothesis (which implies no significant statistical difference among measured ranks and the mean rank), we use the Friedman test. The Friedman statistic value is computed as follows:
$$
X_F^2=\frac{12 N}{k(k+1)}\left[\sum_j R_j^2-\frac{k(k+1)^2}{4}\right] .
$$
In our case, with $k=4$ ECOC designs to compare, $X_F^2=-4.94$. Since this value is rather conservative, Iman and Davenport proposed a corrected statistic:
$$
F_F=\frac{(N-1) X_F^2}{N(k-1)-X_F^2}
$$

Applying this correction we obtain $F_F=-1.32$. With four methods and twelve experiments, $F_F$ is distributed according to the $F$ distribution with 3 and 33 degrees of freedom. The critical value of $F(3,33)$ at the 0.05 significance level is 2.89. As the value of $F_F$ does not exceed 2.89, we cannot reject the null hypothesis, so there is no statistically significant difference among the ECOC schemes. This means that all four strategies are suitable for multi-class categorization problems. This result is very satisfactory and encourages the use of the compact approach, since similar (or even better) results can be obtained with far fewer classifiers. Moreover, the evolutionary GA version of the compact design achieves a better mean rank than the classical coding strategies, and in most cases outperforms the binary compact approach in the present experiment. This result is expected, since the evolutionary version looks for a compact ECOC matrix configuration that minimizes the error over the training data. In particular, the advantage of the evolutionary version over the binary one is more significant when the number of classes increases, since more compact matrices are available for optimization.
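The two statistics above can be sketched in a few lines. This is a minimal illustration of the formulas as written in the text, not the authors' code; the function names and the mean ranks in the usage example are hypothetical.

```python
def friedman_chi2(mean_ranks, n_datasets):
    """Friedman statistic X_F^2 from the mean rank R_j of each of k methods."""
    k = len(mean_ranks)
    sum_sq = sum(r * r for r in mean_ranks)
    return 12 * n_datasets / (k * (k + 1)) * (sum_sq - k * (k + 1) ** 2 / 4)

def iman_davenport(chi2, n_datasets, k):
    """Iman-Davenport correction F_F, compared against F(k-1, (k-1)(N-1))."""
    return (n_datasets - 1) * chi2 / (n_datasets * (k - 1) - chi2)

# Hypothetical mean ranks for k = 4 methods over N = 12 data sets.
ranks = [1.5, 2.0, 3.0, 3.5]
chi2 = friedman_chi2(ranks, 12)
print(chi2, iman_davenport(chi2, 12, len(ranks)))  # 18.0 11.0
```

With $N=12$ and $k=4$ the degrees of freedom are $k-1=3$ and $(k-1)(N-1)=33$, as stated in the text. Passing the reported $X_F^2$ to `iman_davenport` gives roughly $-1.33$, in line with the reported $F_F$.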

Labelled Faces in the Wild Categorization

This dataset contains 13000 face images taken directly from the web from over 1400 people. These images are not constrained in terms of pose, light, occlusions or any other relevant factor. For the purpose of this experiment we used a specific subset, taking only the categories that have at least four examples, for a total of 610 face categories. Finally, in order to extract relevant features from the images, we apply an Incremental Principal Component Analysis procedure [16], keeping $99.8\%$ of the information. An example of face images is shown in Fig. 2.4.
The results in the first row of Table 2.3 show that the best performance is obtained by the Evolutionary GA and PBIL compact strategies. One important observation is that the Evolutionary strategies outperform the classical one-versus-all approach with far fewer classifiers (10 instead of 610). Note that in this case we omitted the one-versus-one strategy, since it requires 185745 classifiers to discriminate 610 face categories.

For this second computer vision experiment, we use the video sequences obtained from the Mobile Mapping System of [1] to test the ECOC methodology on a real traffic sign categorization problem. In this system, the position and orientation of the different traffic signs are measured with video cameras fixed on a moving vehicle. The system has a stereo pair of calibrated cameras, which are synchronized with a GPS/INS system. The result of the acquisition step is a set of stereo pairs of images with their position and orientation information. From this system, a set of 36 circular and triangular traffic sign classes is obtained. Some categories from this data set are shown in Fig. 2.5. The data set contains a total of 3481 samples of size $32 \times 32$, filtered using the Weickert anisotropic filter, masked to exclude the background pixels, and equalized to prevent the effects of illumination changes. These feature vectors are then projected onto a 100-dimensional feature vector by means of PCA.

The classification results obtained when considering the different ECOC configurations are shown in the second row of Table 2.3. The ECOC designs obtain similar classification results, with an accuracy of over $90\%$. However, note that the compact methodologies use 6 times fewer classifiers than the one-versus-all approach and 105 times fewer classifiers than the one-versus-one approach.
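The classifier counts quoted for both vision experiments follow from the number of classes alone. A minimal sketch with hypothetical function names, assuming the compact design uses the minimum number of bits needed to give every class a distinct binary codeword:

```python
def one_vs_all(n_classes: int) -> int:
    """One binary classifier per class."""
    return n_classes

def one_vs_one(n_classes: int) -> int:
    """One binary classifier per unordered pair of classes."""
    return n_classes * (n_classes - 1) // 2

def compact(n_classes: int) -> int:
    """Smallest B with 2**B >= n_classes, i.e. ceil(log2(n_classes))."""
    return (n_classes - 1).bit_length()

for n in (610, 36):  # face categories, traffic sign classes
    print(n, one_vs_all(n), one_vs_one(n), compact(n))
# 610 610 185745 10
# 36 36 630 6
```

For the 36 traffic sign classes this gives 36 / 6 = 6 and 630 / 6 = 105, matching the ratios in the text.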

Machine Learning | QBUS3820

Posted on August 16, 2023 (August 28, 2023) by statistics-lab

Evolutionary Compact Parametrization

When defining a compact design of an ECOC, the possible loss of generalization performance has to be taken into account. In order to deal with this problem an evolutionary optimization process is used to find a compact ECOC with high generalization capability.

In order to show the parametrization complexity of the compact ECOC design, we first provide an estimation of the number of different possible ECOC matrices that we can build, and therefore, the search space cardinality. We approximate this number using some simple combinatorial principles. First of all, if we have an $N$-class problem and $B$ possible bits to represent all the classes, we have a set $CW$ with $2^B$ different words. In order to build an ECOC matrix, we select $N$ codewords from $CW$ without replacement. In combinatorics the choice of codewords is represented as $\binom{2^B}{N}$; since the order in which the codewords are assigned to the classes also matters, we can construct $V_{2^B}^N=\binom{2^B}{N} N!=\frac{\left(2^B\right)!}{\left(2^B-N\right)!}$ different ECOC matrices. Nevertheless, in the ECOC framework, one matrix and its opposite (swapping all zeros for ones and vice versa) are considered the same matrix, since both represent the same partitions of the data. Therefore, the approximate number of possible ECOC matrices with the minimum number of classifiers is $\frac{V_{2^B}^N}{2}=\frac{\left(2^B\right)!}{2\left(2^B-N\right)!}$. In addition to the huge cardinality, it is easy to show that this space is neither continuous nor differentiable, because a change in just one bit of the matrix may produce a wrong coding design.
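To get a feel for the size of this search space, the count can be evaluated directly. A minimal sketch with hypothetical function names, taking $B$ as the minimum number of bits when it is not given:

```python
import math

def min_bits(n_classes: int) -> int:
    """Smallest B such that 2**B >= n_classes."""
    return (n_classes - 1).bit_length()

def ecoc_search_space(n_classes: int, n_bits=None) -> int:
    """Approximate number of distinct ECOC matrices: V(2^B, N) / 2."""
    if n_bits is None:
        n_bits = min_bits(n_classes)
    # Ordered selection of N codewords out of 2^B, halved because a matrix
    # and its bitwise complement define the same partitions.
    return math.perm(2 ** n_bits, n_classes) // 2

print(ecoc_search_space(4))             # 12
print(len(str(ecoc_search_space(36))))  # number of decimal digits for 36 classes
```

Even for the 36-class problem the count has dozens of digits, which is why exhaustive search is not an option.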

In this type of scenario, evolutionary approaches are often introduced with good results. Evolutionary algorithms are a wide family of methods inspired by Darwin’s theory of evolution, and they are usually formulated as optimization processes for problems where the solution space is neither differentiable nor well defined. In these cases, simulating the natural evolution process on a computer yields stochastic optimization techniques that often outperform classical optimization methods when applied to difficult real-world problems. Although the most used and studied evolutionary algorithms are Genetic Algorithms (GA), since the publication of Population Based Incremental Learning (PBIL) in 1995 by Baluja and Caruana [4], a new family of evolutionary methods has been striving to find a place in this field. In contrast to GA, these new algorithms consider each value in the chromosome as a random variable, and their goal is to learn a probability model that describes the characteristics of good individuals. In the case of PBIL, if a binary chromosome is used, an independent Bernoulli distribution is learned for each variable in order to estimate its probability of being one or zero.
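The idea of learning a per-bit probability model can be illustrated with a stripped-down PBIL loop. This is a sketch under simplifying assumptions, not the configuration used in the chapter: it maximizes a toy fitness (the number of ones in the chromosome), omits the mutation step of the original algorithm, and uses hypothetical parameter names and defaults.

```python
import random

def pbil(fitness, n_bits, pop_size=50, lr=0.1, generations=100, seed=0):
    """Minimal PBIL: keep one probability per bit and pull it towards the
    best individual sampled in each generation."""
    rng = random.Random(seed)
    probs = [0.5] * n_bits                      # start undecided on every bit
    best, best_fit = None, float("-inf")
    for _ in range(generations):
        population = [[1 if rng.random() < p else 0 for p in probs]
                      for _ in range(pop_size)]
        gen_best = max(population, key=fitness)
        gen_fit = fitness(gen_best)
        if gen_fit > best_fit:
            best, best_fit = gen_best, gen_fit
        # Move each probability a small step towards the generation's best bit.
        probs = [(1 - lr) * p + lr * bit for p, bit in zip(probs, gen_best)]
    return probs, best, best_fit

probs, best, best_fit = pbil(sum, n_bits=16)    # toy fitness: count of ones
print(best_fit, [round(p, 2) for p in probs])
```

For the ECOC problem the fitness would instead be derived from the classification error of the matrix encoded in the chromosome, to be minimized rather than maximized.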

In this chapter, we report experiments made with the selected evolutionary strategies – i.e. GA and PBIL. Note that for both Evolutionary Strategies, the encoding step and the adaptation function are exactly equivalent.

Problem encoding

Problem encoding: The first step in order to use an evolutionary algorithm is to define the problem encoding, which consists of the representation of a certain solution or point in the search space by means of a genotype or, alternatively, a chromosome [14]. When the solutions or individuals are transformed in order to be represented in a chromosome, the original values (the individuals) are referred to as phenotypes, and each of the possible settings for a phenotype is an allele. Binary encoding is the most common, mainly because the first works on GA used this type of encoding. In binary encoding, every chromosome is a string of bits. Although this encoding is often not natural for many problems, and sometimes corrections must be performed after crossover and/or mutation, in our case the chromosomes represent binary ECOC matrices, and therefore this encoding adapts perfectly to the problem. Each ECOC is encoded as a binary chromosome $\zeta=\left\langle h_1^{c_1}, \ldots, h_B^{c_1}, h_1^{c_2}, \ldots, h_B^{c_2}, \ldots, h_1^{c_N}, \ldots, h_B^{c_N}\right\rangle$, where $h_i^{c_j} \in\{0,1\}$ is the expected value of the $i$-th classifier for the class $c_j$, which corresponds to the $i$-th bit of the class $c_j$ codeword.
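The mapping between an ECOC matrix and its chromosome can be sketched as a simple flatten and reshape. The layout is an assumption made for illustration: one row per class codeword, with codewords concatenated class by class. The function names are hypothetical.

```python
def encode(matrix):
    """Flatten an N x B binary ECOC matrix (one codeword per row) into a
    chromosome by concatenating the class codewords."""
    return [bit for codeword in matrix for bit in codeword]

def decode(chromosome, n_classes):
    """Recover the N x B matrix from a chromosome of length N * B."""
    n_bits = len(chromosome) // n_classes
    return [chromosome[c * n_bits:(c + 1) * n_bits] for c in range(n_classes)]

M = [[0, 0], [0, 1], [1, 0], [1, 1]]   # 4 classes, B = 2 bits
zeta = encode(M)
print(zeta)                 # [0, 0, 0, 1, 1, 0, 1, 1]
print(decode(zeta, 4) == M) # True
```

Because every bit of the chromosome is one entry of the matrix, crossover and mutation act directly on the coding design, with no intermediate representation.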

Adaptation function: Once the encoding is defined, we need to define the adaptation function, which associates to each individual its adaptation value to the environment, and thus, their survival probability. In the case of the ECOC framework, the adaptation value must be related to the classification error.

Given a chromosome $\zeta=\left\langle\zeta_0, \zeta_1, \ldots, \zeta_L\right\rangle$ with $\zeta_i \in\{0,1\}$, the first step is to recover the ECOC matrix $M$ codified in this chromosome. The elements of $M$ allow us to create binary classification problems from the original multi-class problem, following the partitions defined by the ECOC columns. Each binary problem is addressed by means of a binary classifier, which is trained in order to separate both partitions of classes. Assuming that there exists a function $y=f(x)$ that maps each sample $x$ to its real label $y$, training a classifier consists of finding the best parameters $w^*$ of a certain function $y=f^{\prime}(x, w)$, such that for any other $w \neq w^*$, $f^{\prime}\left(x, w^*\right)$ is a better approximation to $f$ than $f^{\prime}(x, w)$. Once the $w^*$ are estimated for each binary problem, the adaptation value corresponds to the classification error. In order to take into account the generalization power of the trained classifiers, the estimation of $w^*$ is performed over a subset of the samples, while the rest of the samples are reserved for a validation set, and the adaptation value $\varepsilon$ is the classification error over that validation subset. The adaptation value for an individual represented by a certain chromosome $\zeta_i$ can be formulated as:
$$
\varepsilon_i\left(P, Y, M_i\right)=\frac{1}{s} \sum_{j=1}^s \mathbb{1}\left[\delta\left(\rho_j, M_i\right) \neq y_j\right],
$$
where $M_i$ is the ECOC matrix encoded in $\zeta_i$, $P=\left\langle\rho_1, \ldots, \rho_s\right\rangle$ is a set of samples, $Y=\left\langle y_1, \ldots, y_s\right\rangle$ contains the expected labels for the samples in $P$, $\mathbb{1}[\cdot]$ equals 1 when its argument is true and 0 otherwise, and $\delta$ is the function that returns the classification label applying the decoding strategy.
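The adaptation value is the fraction of validation samples that the decoded ECOC output gets wrong. A minimal sketch, assuming Hamming decoding as the decoding strategy and assuming the binary classifiers have already produced one predicted bit per column for each sample. Both are illustrative choices, and the function names are hypothetical.

```python
def hamming_decode(predicted_bits, matrix):
    """Return the index of the class whose codeword is closest in Hamming
    distance to the predicted bits (ties go to the lowest index)."""
    distances = [sum(p != m for p, m in zip(predicted_bits, codeword))
                 for codeword in matrix]
    return distances.index(min(distances))

def adaptation_value(predictions, labels, matrix):
    """Classification error over a validation set of decoded predictions."""
    errors = sum(hamming_decode(bits, matrix) != y
                 for bits, y in zip(predictions, labels))
    return errors / len(labels)

M = [[0, 0], [0, 1], [1, 0], [1, 1]]              # 4 classes, 2 dichotomizers
predictions = [[0, 0], [0, 1], [1, 1], [1, 0]]    # outputs for 4 samples
labels = [0, 1, 2, 3]
print(adaptation_value(predictions, labels, M))   # 0.5
```

Lower values indicate fitter individuals, so an evolutionary algorithm that maximizes fitness would use, for example, one minus this error.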

Machine Learning | MKTG6010

Posted on August 16, 2023 (August 28, 2023) by statistics-lab

Local Binary Patterns

The local binary pattern (LBP) operator [14] is a powerful 2D texture descriptor that has the benefit of being somewhat insensitive to variations in the lighting and orientation of an image. The method has been successfully applied to applications such as face recognition [1] and facial expression recognition [16]. As illustrated in Fig. 1.2, the LBP algorithm associates each interior pixel of an intensity image with a binary code number in the range 0 to 255. This code number is generated by taking the surrounding pixels and, working in a clockwise direction from the top left hand corner, assigning a bit value of 0 where the neighbouring pixel intensity is less than that of the central pixel and 1 otherwise. The concatenation of these bits produces an eight-digit binary code word which becomes the grey-scale value of the corresponding pixel in the transformed image. Figure 1.2 shows a pixel being compared with its immediate neighbours. It is, however, also possible to compare a pixel with others that are separated by distances of two, three or more pixel widths, giving rise to a series of transformed images. Each such image is generated using a different radius for the circularly symmetric neighbourhood over which the LBP code is calculated.

Another possible refinement is to obtain a finer angular resolution by using more than 8 bits in the code-word [14]. Note that the choice of the top left hand corner as a reference point is arbitrary and that different choices would lead to different LBP codes; valid comparisons can be made, however, provided that the same choice of reference point is made for all pixels in all images.
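The basic 8-neighbour operator can be sketched in plain Python as follows. Treating the first neighbour (top left) as the most significant bit is an assumption made for illustration, as are the function names; any fixed convention works as long as it is applied to every pixel of every image.

```python
# Neighbour offsets (row, col), clockwise starting from the top-left corner.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]

def lbp_code(image, row, col):
    """LBP code of an interior pixel: bit 0 where the neighbour is darker
    than the centre, 1 otherwise, concatenated clockwise from top-left."""
    centre = image[row][col]
    code = 0
    for dr, dc in OFFSETS:
        bit = 0 if image[row + dr][col + dc] < centre else 1
        code = (code << 1) | bit
    return code

def lbp_image(image):
    """Transform every interior pixel; the one-pixel border is dropped."""
    rows, cols = len(image), len(image[0])
    return [[lbp_code(image, r, c) for c in range(1, cols - 1)]
            for r in range(1, rows - 1)]

patch = [[9, 1, 1],
         [1, 5, 1],
         [1, 1, 1]]
print(lbp_code(patch, 1, 1))  # 128: only the top-left neighbour is >= centre
```

Larger radii or more sampling points, as described above, would replace `OFFSETS` with points on a circle and typically require interpolation, which this sketch leaves out.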

It is noted in [14] that in practice the majority of LBP codes consist of a concatenation of at most three consecutive sub-strings of 0s and 1s; this means that when the circular neighbourhood of the centre pixel is traversed, the result is either all 0s, all 1s, or a starting point can be found which produces a sequence of 0s followed by a sequence of 1s. These codes are referred to as uniform patterns and, for an 8 bit code, there are 58 possible values. Uniform patterns are most useful for texture discrimination purposes as they represent local micro-features such as bright spots, flat spots and edges; non-uniform patterns tend to be a source of noise and can therefore usefully be mapped to the single common value 59.
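The count of 58 uniform patterns can be checked by enumerating all 8-bit codes and counting the 0/1 transitions around the circle: a code is uniform when it has at most two. The label assignment below (1 to 58 for uniform codes in increasing order, 59 for everything else) is one possible convention assumed for illustration, and the names are hypothetical.

```python
def transitions(code, bits=8):
    """Number of 0/1 changes when the bits are traversed circularly."""
    count = 0
    for i in range(bits):
        a = (code >> i) & 1
        b = (code >> ((i + 1) % bits)) & 1
        count += a != b
    return count

uniform = [c for c in range(256) if transitions(c) <= 2]
print(len(uniform))  # 58

# Map each uniform code to its own label and all other codes to label 59.
label = {c: i + 1 for i, c in enumerate(uniform)}
lookup = [label.get(c, 59) for c in range(256)]
```

Histograms built over `lookup` values therefore have 59 bins instead of 256.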

In order to use LBP codes as a face expression comparison mechanism it is first necessary to subdivide a face image into a number of sub-windows and then compute the occurrence histograms of the LBP codes over these regions. These histograms can be combined to generate useful features, for example by concatenating them or by comparing corresponding histograms from two images.
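As a rough sketch of the histogram step, the code below splits an image of LBP codes into a regular grid of sub-windows and concatenates the per-window histograms into one feature vector. The function names, the regular grid, and the choice to ignore any leftover rows or columns when the image size is not a multiple of the grid are all assumptions made for illustration.

```python
def window_histogram(codes, top, left, height, width, n_bins):
    """Occurrence histogram of the codes inside one rectangular sub-window."""
    hist = [0] * n_bins
    for r in range(top, top + height):
        for c in range(left, left + width):
            hist[codes[r][c]] += 1
    return hist


def lbp_feature_vector(codes, grid_rows, grid_cols, n_bins=256):
    """Concatenate the histograms of a grid_rows x grid_cols grid of sub-windows.

    Rows and columns left over after integer division are ignored.
    """
    win_h = len(codes) // grid_rows
    win_w = len(codes[0]) // grid_cols
    features = []
    for i in range(grid_rows):
        for j in range(grid_cols):
            features.extend(
                window_histogram(codes, i * win_h, j * win_w, win_h, win_w, n_bins))
    return features


# Tiny example with 4 possible code values and a 1 x 2 grid of sub-windows.
codes = [[0, 1, 2, 3],
         [0, 1, 2, 3]]
features = lbp_feature_vector(codes, grid_rows=1, grid_cols=2, n_bins=4)
```

In the example, the left window contains only codes 0 and 1 and the right window only codes 2 and 3, so the concatenated vector keeps that spatial information, which a single whole-image histogram would lose.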

计算机代写|机器学习代写machine learning代考|Fast Correlation-Based Filtering

Broadly speaking, feature selection algorithms can be divided into two groups: wrapper methods and filter methods [3]. In the wrapper approach different combinations of features are considered and a classifier is trained on each combination to determine which is the most effective. Whilst this approach undoubtedly gives good results, the computational demands that it imposes render it impractical when a very large number of features needs to be considered. In such cases the filter approach may be used; this considers the merits of features in themselves without reference to any particular classification method.

Fast correlation-based filtering (FCBF) has proved itself to be a successful feature selection method that can handle large numbers of features in a computationally efficient way. It works by considering the correlation between each feature and the class label and between each pair of features. As a measure of correlation the concept of symmetric uncertainty is used; for a pair of random variables $X$ and $Y$ this is defined as:
$$
S U(X, Y)=2\left[\frac{I G(X, Y)}{H(X)+H(Y)}\right]
$$
where $H(\cdot)$ is the entropy of the random variable and $I G(X, Y)=H(X)-H(X \mid Y)=H(Y)-H(Y \mid X)$ is the information gain between $X$ and $Y$. As its name suggests, symmetric uncertainty is symmetric in its arguments; it takes values in the range $[0,1]$ where 0 implies independence between the random variables and 1 implies that the value of each variable completely predicts the value of the other. In calculating the entropies of Eq. 1.6, any continuous features must first be discretised.
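The definition can be sketched for already-discretised samples as follows. The function names are hypothetical, probabilities are estimated by simple relative frequencies, and returning 0 when both variables are constant (so that the denominator vanishes) is a convention assumed here, not something the text specifies.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Empirical entropy H(X), in bits, of a sequence of discrete values."""
    n = len(values)
    return -sum((k / n) * log2(k / n) for k in Counter(values).values())


def conditional_entropy(x, y):
    """Empirical H(X | Y) for paired discrete samples x and y."""
    n = len(x)
    groups = {}
    for xi, yi in zip(x, y):
        groups.setdefault(yi, []).append(xi)
    # Weighted average of the entropy of X within each value of Y.
    return sum(len(g) / n * entropy(g) for g in groups.values())


def symmetric_uncertainty(x, y):
    """SU(X, Y) = 2 * IG(X, Y) / (H(X) + H(Y)), with IG = H(X) - H(X | Y)."""
    h_x, h_y = entropy(x), entropy(y)
    if h_x + h_y == 0:
        return 0.0  # assumed convention when both variables are constant
    info_gain = h_x - conditional_entropy(x, y)
    return 2.0 * info_gain / (h_x + h_y)


su_dependent = symmetric_uncertainty([0, 0, 1, 1], [1, 1, 0, 0])
su_independent = symmetric_uncertainty([0, 0, 1, 1], [0, 1, 0, 1])
```

The two example calls illustrate the end points of the range: the first pair of variables determine each other exactly, the second pair are empirically independent.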

The FCBF algorithm applies heuristic principles that aim to achieve a balance between using relevant features and avoiding redundant features. It does this by selecting features $f$ that satisfy the following properties:

1. $S U(f, c) \geq \delta$, where $c$ is the class label and $\delta$ is a threshold value chosen to suit the application.
2. $\forall g: S U(f, g) \geq S U(f, c) \Rightarrow S U(f, c) \geq S U(g, c)$, where $g$ is any feature other than $f$.

Here, property 1 ensures that the selected features are relevant, in that they are correlated with the class label to some degree, and property 2 eliminates redundant features by discarding those that are strongly correlated with a more relevant feature.
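One way to turn these two properties into a procedure is a greedy pass over the features in decreasing order of relevance, which is sketched below. This is an illustrative simplification under stated assumptions: the function names and the dictionary-of-columns input are hypothetical, features are assumed to be discretised already, and each candidate is compared only against features that have already been kept rather than against every other feature.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Empirical entropy, in bits, of a sequence of hashable values."""
    n = len(values)
    return -sum((k / n) * log2(k / n) for k in Counter(values).values())


def symmetric_uncertainty(x, y):
    """SU(X, Y), using the identity IG(X, Y) = H(X) + H(Y) - H(X, Y)."""
    h_x, h_y = entropy(x), entropy(y)
    if h_x + h_y == 0:
        return 0.0  # assumed convention when both variables are constant
    info_gain = h_x + h_y - entropy(list(zip(x, y)))
    return 2.0 * info_gain / (h_x + h_y)


def fcbf_select(features, labels, delta):
    """Greedy sketch: features maps a name to a list of discrete values."""
    relevance = {name: symmetric_uncertainty(column, labels)
                 for name, column in features.items()}
    # Property 1: keep only features whose relevance reaches the threshold,
    # ordered from most to least relevant.
    candidates = sorted((n for n in relevance if relevance[n] >= delta),
                        key=lambda n: relevance[n], reverse=True)
    selected = []
    for name in candidates:
        # Property 2: drop a feature that is at least as correlated with a
        # more relevant, already-kept feature as it is with the class label.
        redundant = any(
            symmetric_uncertainty(features[name], features[kept]) >= relevance[name]
            for kept in selected)
        if not redundant:
            selected.append(name)
    return selected


labels = [0, 0, 1, 1, 0, 0, 1, 1]
features = {
    "a": [0, 0, 1, 1, 0, 0, 1, 1],  # identical to the label
    "b": [0, 0, 1, 1, 0, 0, 1, 0],  # noisy copy of "a", hence redundant
    "c": [0, 1, 0, 1, 0, 1, 0, 1],  # independent of the label
}
chosen = fcbf_select(features, labels, delta=0.1)
```

In the toy data, feature `c` fails property 1 because it carries no information about the label, and feature `b` fails property 2 because everything it says about the label is already said by `a`.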



计算机代写|机器学习代写machine learning代考|COMP4318

Posted on 2023-08-14, 2023-08-28 by statistics-lab

计算机代写|机器学习代写machine learning代考|Bulk external delivery

The considerations for bulk external delivery aren’t substantially different from internal use serving to a database or data warehouse. The only material differences between these serving cases are in the realms of delivery time and monitoring of the predictions.
DELIVERY CONSISTENCY
Bulk delivery of results to an external party has the same relevancy requirements as any other ML solution. Whether you’re building something for an internal team or generating predictions that will be end-user-customer facing, the goal of creating useful predictions doesn’t change.

The one thing that does change with providing bulk predictions to an outside organization (generally applicable to business-to-business companies) when compared to other serving paradigms is in the timeliness of the delivery. While it may be obvious that a failure to deliver an extract of bulk predictions entirely is a bad thing, an inconsistent delivery can be just as detrimental. There is a simple solution to this, however, illustrated in the bottom portion of figure 16.14.

Figure 16.14 shows the comparison of gated and ungated serving to an external user group. By controlling a final-stage egress from the stored predictions in a scheduled batch prediction job, as well as coupling feature-generation logic to an ETL process governed by a feature store, delivery consistency from a chronological perspective can be guaranteed. While this may not seem an important consideration from the DS perspective of the team generating the predictions, having a predictable data-availability schedule can dramatically increase the perceived professionalism of the serving company.
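The idea of a gated final-stage egress can be illustrated with a small sketch. Everything here is hypothetical: the function name, the batch record fields `batch_id` and `completed_at`, and the rule that a batch is released only if it finished before the scheduled release time are assumptions for illustration, not a description of the system in figure 16.14.

```python
from datetime import datetime


def gated_release(stored_batches, scheduled_release, now):
    """Return the ids of stored prediction batches that may leave the gate.

    stored_batches: list of dicts with 'batch_id' and 'completed_at'
    (a datetime, or None while the batch job is still running).
    """
    if now < scheduled_release:
        return []  # gate is closed until the agreed delivery time
    return [batch["batch_id"] for batch in stored_batches
            if batch["completed_at"] is not None
            and batch["completed_at"] <= scheduled_release]


batches = [
    {"batch_id": "a", "completed_at": datetime(2023, 8, 14, 2, 0)},
    {"batch_id": "b", "completed_at": None},
    {"batch_id": "c", "completed_at": datetime(2023, 8, 14, 7, 0)},
]
release_at = datetime(2023, 8, 14, 6, 0)

before_gate = gated_release(batches, release_at, now=datetime(2023, 8, 14, 5, 0))
after_gate = gated_release(batches, release_at, now=datetime(2023, 8, 14, 6, 30))
```

The point of the gate is that the external consumer sees data appear at the same time on every run, regardless of how early or late the upstream job happened to finish; what to do with a batch that misses the window is a policy decision left out of this sketch.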

计算机代写|机器学习代写machine learning代考|QUALITY ASSURANCE

An occasionally overlooked aspect of serving bulk predictions externally (external to the DS and analytics groups at a company) is ensuring that a thorough quality check is performed on those predictions.

An internal project may rely on a simple check for overt prediction failures (for example, catching silent failures that result in null values, or a linear model that predicts infinity). When sending data products externally, additional steps should be taken to minimize the chances of end users of the predictions finding fault with them. Since we, as humans, are so adept at finding abnormalities in patterns, a few scant issues in a batch-delivered prediction dataset can easily draw the focus of a consumer of the data, deteriorating their faith in the efficacy of the solution to the point of disuse.

In my experience, when delivering bulk predictions external to a team of data specialists, I’ve found it worthwhile to perform a few checks before releasing the data:

- Validate the predictions against the training data:
  - Classification problems: compare aggregated class counts.
  - Regression problems: compare the prediction distribution.
  - Unsupervised problems: evaluate group membership counts.
- Check for prediction outliers (applicable to regression problems).
- Build (if applicable) heuristic rules based on knowledge from SMEs to ensure that predictions are not outside the realm of possibility for the topic.
- Validate incoming features (particularly encoded ones that may use a generic catchall encoding if the encoding key is previously unseen) to ensure that the data is fully compatible with the model as it was trained.
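The checks listed above could be sketched along the following lines. All function names, the choice of statistics (difference in class share, training target range with a tolerance), and the rule format are assumptions made for illustration; a real pipeline would pick thresholds and tests to suit the problem.

```python
from collections import Counter


def class_share_drift(train_labels, predicted_labels):
    """Largest absolute difference in class share, training vs. predictions."""
    train, pred = Counter(train_labels), Counter(predicted_labels)
    n_train, n_pred = len(train_labels), len(predicted_labels)
    classes = set(train) | set(pred)
    return max(abs(train[c] / n_train - pred[c] / n_pred) for c in classes)


def regression_outliers(train_targets, predictions, tolerance=0.0):
    """Predictions outside the training target range, widened by a
    fraction (tolerance) of that range on each side."""
    low, high = min(train_targets), max(train_targets)
    margin = tolerance * (high - low)
    return [p for p in predictions if p < low - margin or p > high + margin]


def violated_rules(prediction, rules):
    """rules: list of (description, predicate); return failed descriptions."""
    return [description for description, holds in rules if not holds(prediction)]


def unseen_categories(train_values, incoming_values):
    """Category values in the incoming data that were absent during training."""
    return sorted(set(incoming_values) - set(train_values))


drift = class_share_drift(["a", "a", "b", "b"], ["a", "a", "a", "b"])
outliers = regression_outliers([10, 20, 30], [15, 35, -5, 31], tolerance=0.1)
failed = violated_rules(-3.0, [("price must be non-negative", lambda p: p >= 0)])
unseen = unseen_categories(["red", "blue"], ["blue", "green", "green"])
```

Each helper returns something that can be logged or compared against a threshold before the batch is released, so a failed check can block delivery rather than surface in front of the consumer.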

Running a few extra validation steps on the output of a batch prediction can spare end users a great deal of confusion and avoid a potential loss of trust in the final product.

