## 数学代写|理论计算机代写theoretical computer science代考|Our Proposed Algorithm

statistics-lab™ 为您的留学生涯保驾护航 在代写理论计算机theoretical computer science方面已经树立了自己的口碑, 保证靠谱, 高质且原创的统计Statistics代写服务。我们的专家在代写理论计算机theoretical computer science代写方面经验极为丰富，各种代写理论计算机theoretical computer science相关的作业也就用不着说。

• Statistical Inference 统计推断
• Statistical Computing 统计计算
• (Generalized) Linear Models 广义线性模型
• Statistical Machine Learning 统计机器学习
• Longitudinal Data Analysis 纵向数据分析
• Foundations of Data Science 数据科学基础

## 数学代写|理论计算机代写theoretical computer science代考|Our Proposed Algorithm

The proposed framework and algorithm is described in this section. The classifier works better with more labeled data, unlabeled data is much more than labeled data in semi supervised algorithms. If unlabeled data is labeled correctly and added to the labeled set, performance of the classifier improves.

Our proposed algorithm is a semi-supervised self-training algorithm. Figure 4 shows the steps of the proposed framework. In the first step, we find $\delta$ and $\rho$ for all training data set (labeled and unlabeled) and then the high density peaks are found in the labeled data set. The number of peaks is the number of classes. Then, for each peak we find corresponding far point. Step 2 consists of two parts. One section is about selecting a set of unlabeled data which is the candidate for labeling. The distance of all unlabeled data are calculated from decision boundary. Next unlabeled data are selected for labeling that are closer to the decision boundary. Threshold (distance of decision boundary) is considered the mean distance of all unlabeled data from decision boundary.

Another part of this step is how to label unlabeled data. Our proposed method predicts the label of unlabeled data by high density peak and Apollonius circle concepts. The label of peak $k_{i}$ is assigned to the unlabeled data that are inside the Apollonius circle corresponding $p e a k_{i}$ and in parallel the same unlabeled subset is given to SVM to label them. The agreement based on the classifier predictions and Apollonius circle is used to select a newly-labeled set The labeled set is updated and is used to retrain the classifier. The process is repeated until it reaches the stopping condition. Figure 4 represents the overview of proposed algorithm.

## 数学代写|理论计算机代写theoretical computer science代考|Experimental Setup

In this section, we perform several experiments to compare the classification performance of our proposed algorithm to the state-of-the-art semi-supervised methods using several different datasets. We also setup several experiments to show, the impact of selecting data close the decision boundary for improving the classification performance.

In the experiment, some UCI datasets are used. Table 1 summarizes the specification of 8 benchmark datasets from the UCI data repository which are used in our experiments.For each dataset, $30 \%$ of data are kept as test set randomly and the rest are used for training set. The training set is divided into two sets of labeled data and unlabeled data. Classes are selected at a proportions equal to the original dataset for all sets. Initially, we have assigned $5 \%$ to the labeled data and $95 \%$ to the unlabeled data. We have repeated each experiment ten times with different subsets of training and testing data. We report the mean accuracy rate (MAR) of 10 times repeating the experiments.

Table 2 shows the comparison of our algorithm with some other algorithms when labeled data is $10 \%$. The second and third columns in this table give respectively the performance of the supervised SVM and self-training SVM. The fourth col$u m n$ is the performance of state-of-the-art algorithm that is called STC-DPC algorithm [7]. The last column is the performance of our algorithm. Base learner for all of algorithms is SVM. Cut of distance parameter (dc) for our algorithm and STC-DPC is $0.05$. From Table 2 , we observe that Our algorithm works well for datasets that have a separate data density such as Iris, Seeds, Wine. Our proposed algorithm doesn’t work very well if dataset is very mixed, such as banknote Fig. 5 and Fig. 6. We also investigate the behavior of the algorithms based on increasing ratio of labeled data. Fig. 7 is a comparison of the three algorithms with the increase ratio of label data from $5 \%$ to $50 \%$.

## 数学代写|理论计算机代写theoretical computer science代考|Impact of Selecting Data Close the Decision Boundary

In most datasets, labeling all unlabeled data can not improve the performance but also reduces the accuracy. In addition to decreasing accuracy, runtime is also increased. Although the unlabeled data that are far from the decision boundary are more reliable, they are not informative. They play little role in improving decision boundary. That’s why we haven’t added all the unlabeled data to the

training set, rather, we add those that are closer to the decision boundary than a certain value.

We show the results on a number of datasets in Table 3 . The second column is the accuracy of our algorithm when we have added all the unlabeled data and the third column is the accuracy of our algorithm when we add only the data point closes to the decision boundary. As can be seen from Table 3 , the accuracy of the algorithm increases when we only add unlabeled data closer to the decision boundary instead of all the points.

In this paper, we proposed a semi-supervised self-training method based on Apollonius, named SSApolo. First candidate data are selected from among the unlabeled data to be labeled in the self training process, then, using the density peak clustering, the peak points are found and by making an Apollonius circle of each peak points, their neighbors are found and labeled. Support vector machine is used for classification. A series of experiments was performed on some datasets and the performance of the proposed algorithm was evaluated. According to the experimental results, we conclude that our algorithm performs better than STCDPC algorithm and supervised SVM and self-training SVM, especially when classes of dataset are not very mixed. In addition, the impact of selecting data are close to decision boundary was investigated. We find that selecting data are close to decision boundary can improves the performance.

## 有限元方法代写

tatistics-lab作为专业的留学生服务机构，多年来已为美国、英国、加拿大、澳洲等留学热门地的学生提供专业的学术服务，包括但不限于Essay代写，Assignment代写，Dissertation代写，Report代写，小组作业代写，Proposal代写，Paper代写，Presentation代写，计算机作业代写，论文修改和润色，网课代做，exam代考等等。写作范围涵盖高中，本科，研究生等海外留学全阶段，辐射金融，经济学，会计学，审计学，管理学等全球99%专业科目。写作团队既有专业英语母语作者，也有海外名校硕博留学生，每位写作老师都拥有过硬的语言能力，专业的学科背景和学术写作经验。我们承诺100%原创，100%专业，100%准时，100%满意。

## MATLAB代写

MATLAB 是一种用于技术计算的高性能语言。它将计算、可视化和编程集成在一个易于使用的环境中，其中问题和解决方案以熟悉的数学符号表示。典型用途包括：数学和计算算法开发建模、仿真和原型制作数据分析、探索和可视化科学和工程图形应用程序开发，包括图形用户界面构建MATLAB 是一个交互式系统，其基本数据元素是一个不需要维度的数组。这使您可以解决许多技术计算问题，尤其是那些具有矩阵和向量公式的问题，而只需用 C 或 Fortran 等标量非交互式语言编写程序所需的时间的一小部分。MATLAB 名称代表矩阵实验室。MATLAB 最初的编写目的是提供对由 LINPACK 和 EISPACK 项目开发的矩阵软件的轻松访问，这两个项目共同代表了矩阵计算软件的最新技术。MATLAB 经过多年的发展，得到了许多用户的投入。在大学环境中，它是数学、工程和科学入门和高级课程的标准教学工具。在工业领域，MATLAB 是高效研究、开发和分析的首选工具。MATLAB 具有一系列称为工具箱的特定于应用程序的解决方案。对于大多数 MATLAB 用户来说非常重要，工具箱允许您学习应用专业技术。工具箱是 MATLAB 函数（M 文件）的综合集合，可扩展 MATLAB 环境以解决特定类别的问题。可用工具箱的领域包括信号处理、控制系统、神经网络、模糊逻辑、小波、仿真等。

## 数学代写|理论计算机代写theoretical computer science代考|Margin-Based Semi-supervised Learning

statistics-lab™ 为您的留学生涯保驾护航 在代写理论计算机theoretical computer science方面已经树立了自己的口碑, 保证靠谱, 高质且原创的统计Statistics代写服务。我们的专家在代写理论计算机theoretical computer science代写方面经验极为丰富，各种代写理论计算机theoretical computer science相关的作业也就用不着说。

• Statistical Inference 统计推断
• Statistical Computing 统计计算
• (Generalized) Linear Models 广义线性模型
• Statistical Machine Learning 统计机器学习
• Longitudinal Data Analysis 纵向数据分析
• Foundations of Data Science 数据科学基础

## 数学代写|理论计算机代写theoretical computer science代考|Review of Concepts Related to the Density Peaks and the Apollonius Circle

In this section, the concepts of the Apollonius circle are discussed. We propose the neighborhood structure of the Apollonius circle to label the unlabeled data.
Apollonius circle is one of the most famous problems in geometry [18]. The goal is to find accurate neighborhoods. In a set of points, there is no information about the relationship between the points and some databases may not even contain topological information. It is more accurate than the Gabriel graph [25] and neighborhood graph in the neighborhood construction. It evaluates important neighborhood entirely. In this method, the first step is to find high density points and the second step is to build neighborhood groups with the Apollonius circle. The third step is analyzing points outside the radius of the Apollonian circles or points within the region.

## 数学代写|理论计算机代写theoretical computer science代考|Finding High Density Points

Rodriguez and Laio [20] presented the algorithm to find high density points (DPC). The high density points are found by this method and then stored in an array.

The points are shown by the vector $\mathbf{M}=\left(M_{1 i}, M_{2 i}, \ldots, M_{m i}\right)$ where $m$ is the number of attributes, also $N_{M_{i}}$ shows $k$ nearest neighbors of $M_{i} \cdot d\left(M_{i}, M_{j}\right)$ is the Euclidean distance between $M_{i}$ and $M_{j}$. The percent of the neighborhood is shown by $p$. The number of neighbors is obtained by $r=p \times n$, where $n$ is the number of data points. The local density $\rho_{i}$ is then defined as:
$$\begin{gathered} \rho_{i}=\exp \left(-\left(\frac{1}{r} \sum_{M_{j} \in N\left(M_{i}\right)} d\left(M_{i}, M_{J}\right)^{2}\right)\right) \ d\left(M_{i}, M_{j}\right)=\left|M_{i}-M_{j}\right| \end{gathered}$$
where $\delta_{i}$ is the minimum distance between $M_{i}$ and any other sample with higher density than $p_{i}$, which is define as below:
$$\delta_{i}= \begin{cases}\min {\rho{i}<\rho_{j}}\left{d\left(M_{i}, M_{j}\right)\right}, & \text { if } \exists j \quad \rho_{i}<\rho_{j} \ \max {j}\left{d\left(M{i}, M_{j}\right)\right}, & \text { otherwise }\end{cases}$$
Peaks (high density points) are obtained using the score function. The points that have the highest score are considered as peak point.
$$\operatorname{score}\left(M_{i}\right)=\delta_{i} \times \rho_{i}$$
In this article, number of peaks are selected based on the number of classes. Peaks are selected from the labeled set. We assign the label of peaks to the unlabeled data based on neighborhood radius of peaks. Neighboring groups are found by the Apollonius circle.

## 数学代写|理论计算机代写theoretical computer science代考|Neighborhood Groups with the Apollonius Circle

The Apollonius circle is the geometric location of the points on the Euclidean plane which have a specified ratio of distances to two fixed points $\mathrm{A}$ and $\mathrm{B}$, this ratio is called $K$ [17]. Apollonius circle can be seen in Fig. 2 .
$$K=d_{1} / d_{2} .$$
The Apollonius circle based on $A$ and $B$ is defined as:
$$C_{A B}=\left{\begin{array}{lll} C_{A} & \text { if } & K<1 \\ C_{B} & \text { if } & K>1 \ C_{\text {inf }} & \text { if } & K=1 \end{array}\right.$$
After finding the high density points, we sort these points. The peak points are indicated by $P=\left(P_{1}, P_{2}, \ldots, P_{m}\right)$ which are arranged in pairs $\left(P_{t}, P_{t+1}\right), t \in{1,2, \ldots, m-1}$, the data points are denoted by $M=$ $\left{M_{i} \mid i \in{1,2, \ldots, n-m}, M_{i} \notin P\right}$. In the next step, data points far from the peak points are determined by the formula 7 . Finally the distance of the furthest point from the peak points is calculated by the formula 8 .
$$\begin{gathered} F d_{P_{t}}=\max \left{d\left(P_{t}, M_{i}\right) \mid M_{i} \in M \text { and } d\left(P_{t}, M_{i}\right)<d\left(P_{t}, P_{t+1}\right)\right. \ \text { and } \left.d\left(P_{t}, M_{i}\right)<\min {l=1, l \in P} d\left(P{l}, M_{i}\right) \text { s.t. } t \neq l\right} \ F P_{t}=\left{M_{i} \mid d\left(P_{t}, M_{i}\right)=F d_{t}\right} . \end{gathered}$$
The points between the peak point and the far point are inside the Apollonius circle, circle [18].

The above concepts are used to label the unlabeled samples confidently. In our proposed algorithm, label of the peak points are assigned to the unlabeled example which are inside the Apollonius circle. The steps are shown in Fig. 3 .

## 数学代写|理论计算机代写theoretical computer science代考|Finding High Density Points

Rodriguez 和 Laio [20] 提出了寻找高密度点 (DPC) 的算法。通过这种方法找到高密度点，然后将其存储在数组中。

ρ一世=经验⁡(−(1r∑米j∈ñ(米一世)d(米一世,米Ĵ)2)) d(米一世,米j)=|米一世−米j|

\delta_{i}= \begin{cases}\min {\rho{i}<\rho_{j}}\left{d\left(M_{i}, M_{j}\right)\right}, & \text { if } \exists j \quad \rho_{i}<\rho_{j} \ \max {j}\left{d\left(M{i}, M_{j}\right)\right}， & \text { 否则 }\end{cases}\delta_{i}= \begin{cases}\min {\rho{i}<\rho_{j}}\left{d\left(M_{i}, M_{j}\right)\right}, & \text { if } \exists j \quad \rho_{i}<\rho_{j} \ \max {j}\left{d\left(M{i}, M_{j}\right)\right}， & \text { 否则 }\end{cases}

## 数学代写|理论计算机代写theoretical computer science代考|Neighborhood Groups with the Apollonius Circle

ķ=d1/d2.

$$C_{AB}=\left{C一种 如果 ķ<1C乙 如果 ķ>1 C信息 如果 ķ=1\对。 找到高密度点后，我们对这些点进行排序。峰值点由 P=\left(P_{1}, P_{2}, \ldots, P_{m}\right) 表示，它们成对排列 \left(P_{t}, P_{t +1}\right), t \in{1,2, \ldots, m-1}，数据点记为M= \left{M_{i} \mid i \in{1, 2, \ldots, nm}, M_{i} \notin P\right}。在下一步中，远离峰值点的数据点由公式7确定。最后通过公式8计算最远点到峰值点的距离。找到高密度点后，我们对这些点进行排序。峰值点由 P=\left(P_{1}, P_{2}, \ldots, P_{m}\right) 表示，它们成对排列 \left(P_{t}, P_{t +1}\right), t \in{1,2, \ldots, m-1}，数据点记为M= \left{M_{i} \mid i \in{1, 2, \ldots, nm}, M_{i} \notin P\right}。在下一步中，远离峰值点的数据点由公式7确定。最后通过公式8计算最远点到峰值点的距离。 \begin{gathered} F d_{P_{t}}=\max \left{d\left(P_{t}, M_{i}\right) \mid M_{i} \in M \text { and } d \left(P_{t}, M_{i}\right)<d\left(P_{t}, P_{t+1}\right)\right。\ \text { 和 } \left.d\left(P_{t}, M_{i}\right)<\min {l=1, l \in P} d\left(P{l}, M_{i }\right) \text { st } t \neq l\right} \F P_{t}=\left{M_{i} \mid d\left(P_{t}, M_{i}\right)=F d_{t}\right} 。\结束{聚集}\begin{gathered} F d_{P_{t}}=\max \left{d\left(P_{t}, M_{i}\right) \mid M_{i} \in M \text { and } d \left(P_{t}, M_{i}\right)<d\left(P_{t}, P_{t+1}\right)\right。\ \text { 和 } \left.d\left(P_{t}, M_{i}\right)<\min {l=1, l \in P} d\left(P{l}, M_{i }\right) \text { st } t \neq l\right} \F P_{t}=\left{M_{i} \mid d\left(P_{t}, M_{i}\right)=F d_{t}\right} 。\结束{聚集}$$

## 有限元方法代写

tatistics-lab作为专业的留学生服务机构，多年来已为美国、英国、加拿大、澳洲等留学热门地的学生提供专业的学术服务，包括但不限于Essay代写，Assignment代写，Dissertation代写，Report代写，小组作业代写，Proposal代写，Paper代写，Presentation代写，计算机作业代写，论文修改和润色，网课代做，exam代考等等。写作范围涵盖高中，本科，研究生等海外留学全阶段，辐射金融，经济学，会计学，审计学，管理学等全球99%专业科目。写作团队既有专业英语母语作者，也有海外名校硕博留学生，每位写作老师都拥有过硬的语言能力，专业的学科背景和学术写作经验。我们承诺100%原创，100%专业，100%准时，100%满意。

## MATLAB代写

MATLAB 是一种用于技术计算的高性能语言。它将计算、可视化和编程集成在一个易于使用的环境中，其中问题和解决方案以熟悉的数学符号表示。典型用途包括：数学和计算算法开发建模、仿真和原型制作数据分析、探索和可视化科学和工程图形应用程序开发，包括图形用户界面构建MATLAB 是一个交互式系统，其基本数据元素是一个不需要维度的数组。这使您可以解决许多技术计算问题，尤其是那些具有矩阵和向量公式的问题，而只需用 C 或 Fortran 等标量非交互式语言编写程序所需的时间的一小部分。MATLAB 名称代表矩阵实验室。MATLAB 最初的编写目的是提供对由 LINPACK 和 EISPACK 项目开发的矩阵软件的轻松访问，这两个项目共同代表了矩阵计算软件的最新技术。MATLAB 经过多年的发展，得到了许多用户的投入。在大学环境中，它是数学、工程和科学入门和高级课程的标准教学工具。在工业领域，MATLAB 是高效研究、开发和分析的首选工具。MATLAB 具有一系列称为工具箱的特定于应用程序的解决方案。对于大多数 MATLAB 用户来说非常重要，工具箱允许您学习应用专业技术。工具箱是 MATLAB 函数（M 文件）的综合集合，可扩展 MATLAB 环境以解决特定类别的问题。可用工具箱的领域包括信号处理、控制系统、神经网络、模糊逻辑、小波、仿真等。

## 数学代写|理论计算机代写theoretical computer science代考| Hardness of Exact Computation

statistics-lab™ 为您的留学生涯保驾护航 在代写理论计算机theoretical computer science方面已经树立了自己的口碑, 保证靠谱, 高质且原创的统计Statistics代写服务。我们的专家在代写理论计算机theoretical computer science代写方面经验极为丰富，各种代写理论计算机theoretical computer science相关的作业也就用不着说。

• Statistical Inference 统计推断
• Statistical Computing 统计计算
• (Generalized) Linear Models 广义线性模型
• Statistical Machine Learning 统计机器学习
• Longitudinal Data Analysis 纵向数据分析
• Foundations of Data Science 数据科学基础

## 数学代写|理论计算机代写theoretical computer science代考|Hardness of Exact Computation

In this section we provide several hardness results for the DOUBLE UPPER EC and the UPPER $r$-EC in some graph classes as regular graphs and split graphs.
Theorem 11. Let $G=(V, E)$ be an $(r+1)$-regular graph with $r \geq 2$. Then,
$$\operatorname{uec}{r}(G)=|E|-\operatorname{eds}(G) .$$ Proof. In order to prove this equation, first we will show that if $S \subseteq E$ is a $r$-tec of $G$, then $\bar{S}=E \backslash S$ is an edge dominating set of $G$. Let $\left(V{1}, V_{2}\right)$ be the associated partition related to $S$. By the Property 1 we know that $V_{2}$ is an independent set. Because our graph is $(r+1)$-regular, it is easy to see that $\forall v \in V_{1}, d_{G_{S}}(v)=r$ and $\forall u \in V_{2}, d_{G_{S}}(u)=r+1$. This observation gives us that the set $\bar{S}$ covers each vertex of $V_{1}$ (they have degree $r$ ) and that all the edges in $E$ are incident to a vertex in $V_{1}$ (because $V_{2}$ is an independent set). So, $\bar{S}$ is an edge dominating set.

Conversely, let $S$ be a solution of EDS. We will show that there exists a $r$-tec of size $|\bar{S}|$. Because EDS is equivalent to Lower EM, we consider the edge set $S$ as a maximal matching. Now, let $V^{\prime}=V \backslash V[S]$. Observe that $V^{\prime}$ is an independent set in $G$ and each vertex $v \in V^{\prime}$ has $d_{G_{S}}(v)=r+1$. The first holds because if there exists an edge $e$ between two vertices of $V^{\prime}$ then $S \cup{e}$ will be a matching with size greater than $S$, which contradicts the maximality of $S$. The second holds because the edges in $S$ are not incident to vertices of $V^{\prime}$ by definition, and thus, all the edges in $\bar{S}$ are incident to at least one vertex in $V[S]$. Finally, because $S$ is a matching we have that all the vertices in $V[S]$ have degree $r$ in $G_{\bar{S}}$ so by the Property $1, \bar{S}$ is a $r$-tec. So the Eq. (6) holds.
Corollary 12. UPPER $r$-EC is $\mathbf{N P}$-hard in $(r+1)$-regular bipartite graphs.
Proof. Using the NP-hardness proof of EDS in $r$-regular bipartite graphs given in $[15]$, the results follows from Theorem $11 .$
Corollary 13. DOUBLE UPPER EC is NP-hard in cubic planar graphs.
Proof. Using the NP-hardness proof of EDS for cubic planar graphs given in [23], the results follows from Theorem $11 .$

Theorem 14. UPPER $(r+1)$-EC is NP-hard in graphs of maximum degree $\Delta+1$ if UPPER $r$-EC is $N P$-hard in graphs of maximum degree $\Delta$, and this holds even for bipartite graphs.

Proof. Let $G=(V, E)$ be a bipartite graph of maximum degree $\Delta$, we construct a new graph $G^{\prime}=\left(V^{\prime}, E^{\prime}\right)$ by starting from $r+1$ copies of $G$. Then for each vertex $v \in V$ we add a new vertex $u_{v}$ in $G^{\prime}$ and connect it to each one of the $r+1$ copies of the vertex $v$.

## 数学代写|理论计算机代写theoretical computer science代考|Hardness of Approximation

In the following theorems we provide some inapproximability results for the UPPER EC and the DOUBLE UPPER EC.

Theorem 17. It is NP-hard to approximate the solution of UPPER EC to within $\frac{593}{594}$ and $\frac{363}{364}$ in graphs of max degree 4 and 5 respectively.

Proof. In order to prove this we will use a reduction from Min VC. Starting from an $r$-regular graph $G=(V, E)$ we will construct a new graph $G^{\prime}$; first we add a $P_{2}$ to each vertex $v \in V$. Then for each edge $e=v u \in E$ we add a new vertex $v_{e}$ adjacent to $v$ and $u$. In the end we remove all the starting edges $E$. Since Min VC can not be approximated to within a factor $\frac{100}{99}$ (resp. $\frac{53}{52}$ ) in 3-regular (resp. 4-regular) graphs [11], it deduces the expected results.

Theorem 18. It is NP-hard to approximate the solution of DOUBLE UPPER EC to within $\frac{883}{884}$ in graphs of max degree 6 .

Proof. In order to get the inapproximability result, we first make a reduction from MiN VC to UPPER. EC similar to Theorem 17, then we reduce it to DOUBLE UPPER EC using a reduction proposed in Theorem $14 .$

Theorem 19. It is NP-hard to approximate the solution of DOUBLE UPPER. EC to within $\frac{571}{572}$ in graphs of max degree 9 .

Proof. Again, we will make a reduction from VerTex Cover problem. Starting from a 4-regular graph $G=(V, E)$ we construct a new graph by adding a set of new vertices $V_{E}$ which has one vertex $v_{e}$ for each edge $e \in E$, and then adding new edges $v_{\varepsilon} u$ if the edge $e$ was incident to $u$ in the original graph $G$. Let $G^{\prime}=\left(V^{\prime}, E^{\prime}\right)$ be the new graph. It is easy to see that $\left|V^{\prime}\right|=|V|+|E|$ and $\Delta\left(G^{\prime}\right)=2 \Delta(G)=8$. Furthermore, we can show that from any $V C$ of $G$ we can

construct a tec of $G^{\prime}$ of size $|T E C|=|E|+|V|-|V C|$ and conversely, from any tec of $G^{\prime}$ we can construct a $V C$ of $G$ of size $|V C|=|E|+|V|-|T E C|$. In order to prove the first direction we will start from a $V C$ of $G$. Let $S$ be the set of all the edges $v_{c} u$ where $u \in V C . S$ is a partial tec of $G$ because it covers only the vertices in $V C \cup V_{E}$, any vertex of $V_{E}$ has degree one in $G_{S}^{\prime}$ and the vertices of $V C$ are independent in $G_{S}^{\prime}$. It is easy to extend $S$ to a tec of $G^{\prime}$ by adding one edge for every vertex $v \in V \backslash V C$ that is adjacent to a vertex in $V C$ (there exists because $v \in V$ and $V C$ is a vertex cover of $G$ ). The extended $S$ is a tec due to the fact that the vertices that may have greater degree than one are all in $V C$, which is independent in $G_{S}$ and all the vertices are covered. Now we have to observe that this tec contains exactly one edge for each vertex in $V_{E}$ and one for each vertex in $V \backslash V C$ so the size is exactly
$$|T E C|=|E|+|V|-|V C| .$$
Conversely, we will start from a tec of $G^{\prime}$ and we will construct a vertex cover of $G$ of the wanted size. First we have to show that for any tec $S$ of $G^{\prime}$, if there exists $v_{e} \in V_{E}$ such that $d_{G_{s}^{\prime}}\left(v_{e}\right)=2$ (it can not be greater because $d_{G^{\prime}}\left(v_{c}\right)=2$ ) then there exists an other tec $S^{\prime}$ of $G^{\prime}$ that has the same size and every vertex $v_{e} \in V_{E}$ has degree $d_{G_{S^{\prime}}}\left(v_{e}\right)=1$. This is easy to prove by repeating the following:
If there exists $e=u v \in E$ such that $d_{G_{S^{\prime}}}\left(v_{e}\right)=2$ then $S^{\prime}:=S^{\prime}+v_{e} u-v_{e} v$.

## 数学代写|理论计算机代写theoretical computer science代考|Using Apollonius Circle

Typically, supervised learning methods are useful when there is enough labeled data, but in many real word tasks, unlabeled data is available. Furthermore, in practice, labeling is an expensive and time consuming task, because it needs human experience and efforts [14]. Therefore, finding an approach which can employ both labeled and unlabeled data to construct a proper model is crucial. Such a learning approach is named semi-supervised learning. In semi-supervised learning algorithms, we use labeled data as well as unlabeled data. The main goal of semi-supervised learning is to employ unlabeled instances and combine the information in the unlabeled data with the explicit classification information of labeled data to improve the classification performance. The main challenge

in semi-supervised learning is how to extract knowledge from the unlabeled data $[4,21,30]$. Several different algorithms for semi-supervised learning have been introduced, such as the Expectation Maximization (EM) based algorithms $[11,13,15]$, self-training $[7,9,23]$, co-training $[19,27]$, Transduction Support Vector Machine (TSVM) $[1,12,26]$, Semi-Supervised SVM (S3VM) [3], graph-based methods $[2,28]$, and boosting based semi-supervised learning methods $[5,16,24]$.
Most of the semi-supervised algorithms follow two main approaches, extension of a specific base learner to learn from labeled and unlabeled examples and using a framework to learn from both labeled and unlabeled data regardless of the used base learner. Examples of the first approach include S3VM. TSVM. and LapSVM. Wrapper-based and Boosting-based methods follow the second approach, like Co-training, SemiBoost [16], MSAB [24], and MSSBoost [22]. In this article, we focus on both approaches and propose a novel semi-supervised approach based on the SVM base learner.

Support vector machine is proposed by Cortes and Vapnik $[6]$ and is one of the promising base learners in many practical domains, such as object detection, document and web-page categorization. It is a supervised learning method based on margin as well as statistical learning theory [8]. The purpose of the support vector machine algorithm is to find an optimal hyperplane in an N-dimensional space in order to classify the data points. As shown in Fig. 1 , there are many possible hyperplanes that could be selected. The optimal classification hyperplane of SVM needs not only segregating the data points correctly, but also maximizing the margin [8]. Maximizing the margin leads to a strong classification model. The standard form of SVM is only used for supervised learning tasks. This base learner can not directly handle the semi-supervised classification tasks. There are several extensions to SVM, like S3VM and TSVM. These methods use the unlabeled data to regularize the decision boundary. These methods mainly extend the SVM base classifier to semi-supervised learning, which are computationally expensive $[16]$. Therefore, these approaches are suitable only for small datasets. More recently, a semi-supervised self-training has been used to handle the semi-supervised classification tasks $[7,23,29]$. Self-training is a wrapper-based algorithm that repeatedly uses a supervised learning method. It starts to train on labeled data only. At each step, a set of unlabeled points is labeled according to the current decision function; then the supervised method is retrained using its own predictions as additional labeled points [23]. However, the performance of this method depends on correctly predicting the labels of unlabeled data. This is important, because the selection of incorrect predictions will propagate to produce further classification errors. In general, there are two main challenges. The first is to select the appropriate candidate set of unlabeled examples to label at each iteration of the training procedure. The second is to correctly predict labels to unlabeled examples. To handle these issues, the recent studies tend to find a set of high-confidence predictions at each iteration $[7,23,29]$. These selected examples are typically far away from the decision boundary. Hence, this type of algorithm cannot effectively exploit information from the unlabeled data and the final decision boundary will be very close to that of the initial classifier [22].

## 数学代写|理论计算机代写theoretical computer science代考|Hardness of Approximation

|吨和C|=|和|+|在|−|在C|.

## 有限元方法代写

tatistics-lab作为专业的留学生服务机构，多年来已为美国、英国、加拿大、澳洲等留学热门地的学生提供专业的学术服务，包括但不限于Essay代写，Assignment代写，Dissertation代写，Report代写，小组作业代写，Proposal代写，Paper代写，Presentation代写，计算机作业代写，论文修改和润色，网课代做，exam代考等等。写作范围涵盖高中，本科，研究生等海外留学全阶段，辐射金融，经济学，会计学，审计学，管理学等全球99%专业科目。写作团队既有专业英语母语作者，也有海外名校硕博留学生，每位写作老师都拥有过硬的语言能力，专业的学科背景和学术写作经验。我们承诺100%原创，100%专业，100%准时，100%满意。

## MATLAB代写

MATLAB 是一种用于技术计算的高性能语言。它将计算、可视化和编程集成在一个易于使用的环境中，其中问题和解决方案以熟悉的数学符号表示。典型用途包括：数学和计算算法开发建模、仿真和原型制作数据分析、探索和可视化科学和工程图形应用程序开发，包括图形用户界面构建MATLAB 是一个交互式系统，其基本数据元素是一个不需要维度的数组。这使您可以解决许多技术计算问题，尤其是那些具有矩阵和向量公式的问题，而只需用 C 或 Fortran 等标量非交互式语言编写程序所需的时间的一小部分。MATLAB 名称代表矩阵实验室。MATLAB 最初的编写目的是提供对由 LINPACK 和 EISPACK 项目开发的矩阵软件的轻松访问，这两个项目共同代表了矩阵计算软件的最新技术。MATLAB 经过多年的发展，得到了许多用户的投入。在大学环境中，它是数学、工程和科学入门和高级课程的标准教学工具。在工业领域，MATLAB 是高效研究、开发和分析的首选工具。MATLAB 具有一系列称为工具箱的特定于应用程序的解决方案。对于大多数 MATLAB 用户来说非常重要，工具箱允许您学习应用专业技术。工具箱是 MATLAB 函数（M 文件）的综合集合，可扩展 MATLAB 环境以解决特定类别的问题。可用工具箱的领域包括信号处理、控制系统、神经网络、模糊逻辑、小波、仿真等。

## 数学代写|理论计算机代写theoretical computer science代考|On the Complexity of the Upper $r$-Tolerant Edge Cover Problem

statistics-lab™ 为您的留学生涯保驾护航 在代写理论计算机theoretical computer science方面已经树立了自己的口碑, 保证靠谱, 高质且原创的统计Statistics代写服务。我们的专家在代写理论计算机theoretical computer science代写方面经验极为丰富，各种代写理论计算机theoretical computer science相关的作业也就用不着说。

• Statistical Inference 统计推断
• Statistical Computing 统计计算
• (Generalized) Linear Models 广义线性模型
• Statistical Machine Learning 统计机器学习
• Longitudinal Data Analysis 纵向数据分析
• Foundations of Data Science 数据科学基础

## 数学代写|理论计算机代写theoretical computer science代考|On the Complexity of the Upper $r$-Tolerant Edge Cover Problem

In this paper we define and study tolerant edge cover problems. An edge cover of a graph $G=(V, E)$ without isolated vertices is a subset of edges $S \subseteq E$ which covers all vertices of $G$, that is, each vertex of $G$ is an endpoint of at least one edge in $S$. The edge cover number of a graph $G=(V, E)$, denoted by ec $(G)$, is the minimum size of an edge cover of $G$ and it can be computed in polynomial time (see Chapter 19 in [29]). An edge cover $S \subseteq E$ is called minimal (with respect to inclusion) if no proper subset of $S$ is an edge cover. Minimal edge cover is also known in the literature as an enclaveless set [30] or as a nonblocker set [14]. While a minimum edge cover can be computed efficiently, finding the largest minimal edge cover is NP-hard [27], where it is shown that the problem is equivalent to finding a dominating set of minimum size. The associated optimization problem is called upper edge cover (and denoted UPPER EC) [1] and the corresponding optimal value will be denoted uec $(G)$ in this paper for the graph $G=(V, E)$.
Here, we are interested in minimal edge cover solutions tolerant to the failures of at most $r-1$ edges. Formally, given an integer $r \geq 1$, an edge subset $S \subseteq E$ of $G=(V, E)$ is a tight $r$-tolenant edge-cover ( $r$-tec for short) if the deletion of

any set of at most $r-1$ edges from $S$ maintains an edge cover ${ }^{1}$ and the deletion of any edge from $S$ yields a set which is not a (tight) $r$-tolerant edge cover. Equivalently, we seek an edge subset $S$ of $G$ such that the subgraph $(V, S)$ has minimum degree $r$ and it is minimal with this property. For the sake of brevity we will omit the word ‘tight’ in the rest of the paper. Note that the case $r=1$ corresponds to the standard notion of minimal edge cover.

As an illustrating example consider the situation in which the mayor of a big city seeks to hire a number of guards, from a security company, who will be constantly patrolling streets between important buildings. An $r$-tolerant edge cover reflects the desire of the mayor to guarantee that the security is not compromised even if $r-1$ guards are attacked. Providing a maximum cover would be the goal of a selfish security company, who would like to propose a patrolling schedule with as many guards as possible, but in which all the proposed guards are necessary in the sense that removing any of them would leave some building not $r$-covered.

Related Work. UPPER EC has been investigated intensively during recent years, mainly using the terminologies of spanning star forests and dominating sets. A dominating set in a graph is a subset $S$ of vertices such that any vertex not in $S$ has at least one neighbor in $S$. The minimum dominating set problem (denoted MinDS) seeks the smallest dominating set of $G$ of value $\gamma(G)$. We have the equality $\operatorname{uec}(G)=n-\gamma(G)[27]$.

Thus, using the complexity results known for MinDS, we deduce that UPPER EDGE COVER is NP-hard in planar graphs of maximum degree 3 [20], chordal graphs [6] (even in undirected path graphs, the class of vertex intersection graphs of a collection of paths in a tree), bipartite graphs, split graphs [5] and $k$-trees with arbitrary $k[12]$, and it is polynomial in $k$-trees with fixed $k$, convex bipartite graphs [13], strongly chordal graphs [16]. Concerning the approximability, an APX-hardness proof with explicit inapproximability bound and a combinatorial 0.6-approximation algorithm is proposed in [28]. Better algorithms with approximation ratio $0.71$ and $0.803$ are given respectively in [9] and [2]. For any $\varepsilon>0$, UPPER EDGE COVER is hard to approximate within a factor of $\frac{259}{260}+\varepsilon$ unless $\mathbf{P}=\mathbf{N P}$ [28]. The weighted version of the problem, denoted as UPPER WEIGHTED EDGE CoVER, have been recently studied in [24], in which it is proved that the problem is not $O\left(\frac{1}{n^{1 / 2-\varepsilon}}\right)$ approximable nor $O\left(\frac{1}{\Delta^{1-\varepsilon}}\right)$ in edge weighted graphs of order $n$ and maximum degree $\Delta$.

## 数学代写|理论计算机代写theoretical computer science代考|Definitions

Graph Notation and Terminology. Let $G=(V, E)$ be a graph and $S \subseteq V$; $N_{G}(S)={v \in V: \exists u \in S, v u \in E}$ denotes the neighborhood of $S$ in $G$ and $N_{G}[S]=S \cup N_{G}(S)$ denotes the closed neighborhood of $S$. For singleton sets $S={s}$, we simply write $N_{G}(s)$ or $N_{G}[s]$, even omitting $G$ if clear from the context. The maximum degree and minimum degree of a graph are denoted $\Delta(G)$ and $\delta(G)$ respectively. For a subset of edges $S, V(S)$ denotes the vertices that are incident to edges in $S$. A vertex set $U \subseteq V$ induces the graph $G[U]$ with vertex set $U$ and $e \in E$ being an edge in $G[U]$ iff both endpoints of $e$ are in $U$. If $S \subseteq E$ is an edge set, then $\vec{S}=E \backslash S$, edge set $S$ induces the graph $G[V(S)]$, while $G_{S}=(V, S)$ denotes the partial graph induced by $S$. In particular, $G_{\bar{S}}=(V, E \backslash S)$. Let also $\alpha(G)$ and $\gamma(G)$ denote the size of the largest independent and smallest dominating set of $G$, respectively.

An edge set $S$ is called edge cover if the partial graph $G_{S}$ is spanning and it is a matching if $S$ is a set of pairwise non adjacent edges. An edge set $S$ is

minimal (resp., maximal) with respect to a graph property if $S$ satisfies the graph property and any proper subset $S^{\prime} \subset S$ of $S$ (resp., any proper superset $S^{\prime} \supset S$ of $S$ ) does not satisfy the graph property. For instance, an edge set $S \subseteq E$ is a maximal matching (resp., minimal edge cover) if $S$ is a matching and $S+e$ is not a matching for some $e \in \bar{S}$ (resp., $S$ is an edge cover and $S-e$ is not an edge cover for some $e \in \bar{S}$ ).

Problem Definitions. Let $G=(V, E)$ be a graph where the minimum degree is at least $r \geq 1$, i.e., $\delta(G) \geq r$. We assume $r$ is a constant fixed greater than one (but all results given here hold even if $r$ depends on the graph). A $r$-DEGREE EDGE-COVER $^{3}$ is defined as a subset of edges $G^{\prime}=G_{S}=(V, S)$, such that each vertex of $G$ is incident to at least $r \geq 1$ distinct edges $e \in S$. As $r$-tolerant edge-cover (or simply $r$-tec) we will call an edge set $S \subseteq E$ if it is a minimal $r$-degree edge-cover i.e. if for every $e \in S, G^{\prime}-e=(V, S \backslash{e})$ is not an $r$-degree edge-cover. Alternatively, $\delta\left(G^{\prime}\right)=r$, and $\delta\left(G^{\prime}-e\right)=r-1$. If you seek the minimization version, all the problems are polynomial-time solvable. Actually, the case of $r=1$ corresponds to the edge cover in graphs. The optimization version of a generalization of $r$-EC known as the MIN LOWER-UPPER-COVER PROBLEM (MIN LUCP), consists of, given a graph $G$ where $G=(V, E)$ and two non-negative functions $a, b$ from $V$ such that $\forall v \in V, 0 \leq a(v) \leq b(v) \leq d_{G}(v)$, of finding a subset $M \subseteq E$ such that the partial graph $G_{M}=(V, M)$ induced by $M$ satisfies $a(v) \leq d_{G_{M}}(v) \leq b(v)$ (such a solution will be called a lowerupper-cover) and minimizing its total size $|M|$ among all such solutions (if any). Hence, an $r$-EC solution corresponds to a lower-upper-cover with $a(v)=r$ and $b(v)=d_{G}(v)$ for every $v \in V$. Min LUCP is known to be solvable in polynomial time even for edge-weighted graphs (Theorem $35.2$ in Chap. 35 of Volume A in [29]). We are considering two associated problems, formally described as follows.

## 数学代写|理论计算机代写theoretical computer science代考|Basic Properties of r-Tolerant Solutions

The next property presents a simple characterization of feasible $r$-tec solution generalizing the well known result given for minimal edge covers, i.e., 1-tec, affirming that $S$ is a 1-tec solution of $G$ if and only if $S$ is spanning and the subgraph $(V, S)$ induced by $S$ is $\left(K_{3}, P_{4}\right)$-free.

Property 1. Let $r \geq 1$ and let $G=(V, E)$ be a graph with minimum degree $\delta \geq r$. $S$ is an $r$-tec solution of $G$ if and only if the following conditions meet on $G_{S}=(V, S)$ :
(1) $V=V_{1}(S) \cup V_{2}(S)$ where $V_{1}(S)=\left{v \in V: d_{G_{S}}(v)=r\right}$ and $V_{2}(S)=\left{v \in V: d_{G_{S}}(v)>r\right}$.
(2) $V_{2}(S)$ is an independent set of $G_{S}$.

Proof. Let $r \geq 1$ be a fixed integer and let $G=(V, E)$ be a graph instance of UPPER $r$-EC, i.e., a graph of minimum degree at least $r$. Let us prove the necessary conditions: if $S \subseteq E$ is an $r$-tec solution, then by construction, $V=$ $V_{1}(S) \cup V_{2}(S)$ is a partition of vertices with minimum degree $r$ in $S$. Now, if $u v \in S$ with $u, v \in V_{2}(S)$, then $S-u v$ is also $r$-tec which is a contradiction of minimality.

Now, let us prove the other direction. Consider a subgraph $G^{\prime}=(V, S)$ induced by edge set $S$ satisfying $(1)$ and $(2)$. By (1) it is clear $G S$ has minimum degree at least $r$. If $u v \in S$, then by (2) one vertex, say $u \in V_{1}(S)$ because $V_{2}(S)$ is an independent set. Hence, the deletion of $u v$ leaves $u$ of degree $r-1$ in the subgraph induced by $G_{S \backslash{u v}}$ and then $S$ is an $r$-tec solution.

Property 2. Let $r \geq 1$, for all graphs $G=(V, E)$ of minimum degree at least $r$, the following inequality holds:
$$2 \mathrm{ec}{T}(G) \geq \operatorname{lec}{r}(G)$$
Proof. For a given graph $G=(V, E)$ with $n$ vertices, let $S^{}$ be an optimal solution of UPPER $r$-EC, that is $\left|S^{}\right|=\operatorname{uec}{r}(G)$. Let $\left(V{1}^{}, V_{2}^{}\right)$ be the associated partition related to solution $S^{}$ as indicated in Property 1. Using this characterization, we deduce $\operatorname{uec}{r}(G) \leq r\left|V{1}^{}\right| \leq r n$. On the other side, if $G^{\prime}$ denotes the subgraph induced by a minimum $r$-tec solution of value ec $(G)$, we get $2 \operatorname{ec}{r}(G)=\sum{v \in V} d_{G^{\prime}}(v) \geq r n$. Combining these two inequalities, the results follows.

In particular, inequality (1) of Property 2 shows that any $r$-tec solution is a $\frac{1}{2}$-approximation of UPPER $r$-EC.

The next property is quite natural for induced subgraphs and indicates that the size of an optimal solution of a maximization problem does not decrease with the size of the graph. Nevertheless, this property is false in general when we deal with partial subgraphs; for instance, for the upper domination number, we get $\operatorname{uds}\left(K_{3}\right)=1<2=\operatorname{uds}\left(P_{3}\right)$. It turns out that this inequality is valid for the upper edge cover number.

Property 3. Let $G=(V, E)$ be a graph such that $0<r \leq \delta(G)$. For every partial subgraph $G^{\prime} \subseteq G$ with $\delta\left(G^{\prime}\right) \geq r$, the following inequality holds:
$$\operatorname{uec}{r}(G) \geq \operatorname{uec}{\mathrm{r}}\left(G^{\prime}\right)$$
Proof. Fix an integer $r \geq 1$ and a graph $G=(V, E)$ with $\delta(G) \geq r$. Let $G^{\prime}=$ ( $V^{\prime}, E^{\prime}$ ) with $\delta\left(G^{\prime}\right) \geq r$ be a partial subgraph of $G$, i.e., $V^{\prime} \subseteq V$ and $E^{\prime} \subseteq E$. Consider an upper $r$-tec solution $S^{\prime}$ of $G^{\prime}$ with size $\left|S^{\prime}\right|=$ uec $_{r}\left(G^{\prime}\right)$. We prove inequality (2) by starting from $S=S^{\prime}$ and by iteratively repeating the following procedure:

1. Select a vertex $v \in V$ with $d_{G_{S}}(v)<r$ and $e=u v \in E \backslash S$.
2. If $u$ is covered less or more than $r$ times by $S$, then $S:=S+e$.
3. If vertex $u$ is covered exactly $r$ times by $S$, consider two cases.

## 数学代写|理论计算机代写theoretical computer science代考|Basic Properties of r-Tolerant Solutions

(1) 在=在1(小号)∪在2(小号)在哪里V_{1}(S)=\left{v \in V: d_{G_{S}}(v)=r\right}V_{1}(S)=\left{v \in V: d_{G_{S}}(v)=r\right}和V_{2}(S)=\left{v \in V: d_{G_{S}}(v)>r\right}V_{2}(S)=\left{v \in V: d_{G_{S}}(v)>r\right}.
(2) 在2(小号)是一个独立的集合G小号.

2和C吨(G)≥莱克⁡r(G)

1. 选择一个顶点在∈在和dG小号(在)<r和和=在在∈和∖小号.
2. 如果在被覆盖少于或多于r次由小号， 然后小号:=小号+和.
3. 如果顶点在被准确覆盖r次由小号，考虑两种情况。

## 有限元方法代写

tatistics-lab作为专业的留学生服务机构，多年来已为美国、英国、加拿大、澳洲等留学热门地的学生提供专业的学术服务，包括但不限于Essay代写，Assignment代写，Dissertation代写，Report代写，小组作业代写，Proposal代写，Paper代写，Presentation代写，计算机作业代写，论文修改和润色，网课代做，exam代考等等。写作范围涵盖高中，本科，研究生等海外留学全阶段，辐射金融，经济学，会计学，审计学，管理学等全球99%专业科目。写作团队既有专业英语母语作者，也有海外名校硕博留学生，每位写作老师都拥有过硬的语言能力，专业的学科背景和学术写作经验。我们承诺100%原创，100%专业，100%准时，100%满意。

## MATLAB代写

MATLAB 是一种用于技术计算的高性能语言。它将计算、可视化和编程集成在一个易于使用的环境中，其中问题和解决方案以熟悉的数学符号表示。典型用途包括：数学和计算算法开发建模、仿真和原型制作数据分析、探索和可视化科学和工程图形应用程序开发，包括图形用户界面构建MATLAB 是一个交互式系统，其基本数据元素是一个不需要维度的数组。这使您可以解决许多技术计算问题，尤其是那些具有矩阵和向量公式的问题，而只需用 C 或 Fortran 等标量非交互式语言编写程序所需的时间的一小部分。MATLAB 名称代表矩阵实验室。MATLAB 最初的编写目的是提供对由 LINPACK 和 EISPACK 项目开发的矩阵软件的轻松访问，这两个项目共同代表了矩阵计算软件的最新技术。MATLAB 经过多年的发展，得到了许多用户的投入。在大学环境中，它是数学、工程和科学入门和高级课程的标准教学工具。在工业领域，MATLAB 是高效研究、开发和分析的首选工具。MATLAB 具有一系列称为工具箱的特定于应用程序的解决方案。对于大多数 MATLAB 用户来说非常重要，工具箱允许您学习应用专业技术。工具箱是 MATLAB 函数（M 文件）的综合集合，可扩展 MATLAB 环境以解决特定类别的问题。可用工具箱的领域包括信号处理、控制系统、神经网络、模糊逻辑、小波、仿真等。

## 数学代写|理论计算机代写theoretical computer science代考| Evaluation

statistics-lab™ 为您的留学生涯保驾护航 在代写理论计算机theoretical computer science方面已经树立了自己的口碑, 保证靠谱, 高质且原创的统计Statistics代写服务。我们的专家在代写理论计算机theoretical computer science代写方面经验极为丰富，各种代写理论计算机theoretical computer science相关的作业也就用不着说。

• Statistical Inference 统计推断
• Statistical Computing 统计计算
• (Generalized) Linear Models 广义线性模型
• Statistical Machine Learning 统计机器学习
• Longitudinal Data Analysis 纵向数据分析
• Foundations of Data Science 数据科学基础

## 数学代写|理论计算机代写theoretical computer science代考|Dataset

To evaluate the proposed framework, we have fully implemented it and have run multiple experiments. We evaluate our method for two classification tasks: traffic characterization and application identification. Traffic characterization deals with determining the application category of a trace such as Chat, and the goal of application identification is to distinguish the application of a trace such as Skype. To evaluate the performance of the proposed framework, we have used Precision (Pr), Recall (Rc), and F1-Measure (F1). The mathematical formula for these metrics is provided in Eq. 4 .
$$\text { Recall }=\frac{T P}{T P+F N}, \text { Precision }=\frac{T P}{T P+F P}, F 1=\frac{2 * R c * P r}{R c+P r}$$

We have fully implemented the framework with Python 3. After filtering out the unnecessary packets, 5 -tuple which identifies each network flow along with timestamp and length are extracted for each packet of packet trace. Statistical features provided in Table 1 are extracted by Python Pandas libraries ${ }^{2}$ in the Network Unit Extraction module (Sect. 4.1). Kmeanst+ algorithm is used for the clustering purpose from the scikit-learn library ${ }^{3}$ and StandardScaler function is used for standardizing the feature sets from the same package in the Network Unit Clustering module (Sect. 4.1). To automate identifying the number of clusters of each protocol, we have leveraged KneeLocator from the Kneed library ${ }^{4}$. For implementing the Language Learner module (Sect. 4.2) we benefited from k-TSS language learner part of [11] implementation and modified it based on our purpose. Finally, we have used the grid search method to choose the best value for the framework’s hyper-parameters, including $s t, i t, f d$, stats, and $k$.

## 数学代写|理论计算机代写theoretical computer science代考|Classification Results

We have used a grid search to fine-tune the parameters $s t, f d$, it, stats, and $k$, called hyper-parameters in machine learning. Based on the average F1-Measure, consequently, the best values for the application identification task are $15 \mathrm{~s}, 5 \mathrm{~s}$, $10 \mathrm{~s}, 2$, and 3 for the hyper-parameters $s t, f d$, it, stats, and $k$, respectively, while for the traffic characterization task, they are $15 \mathrm{~s}, 15 \mathrm{~s}, 10 \mathrm{~s}, 2$, and 3 . Using the elbow method, the number of clusters for the most used protocol such as TCP is automatically set to 23 and for a less used protocol such as HTTP is set to six, resulting in 101 total number of clusters.

We have evaluated the framework on the test set with the chosen hyperparameters. The framework has gained weighted average F1-Measure of $97 \%$ for both tasks. Table 2 provides the proposed framework performance in terms of Precision, Recall, and F1-Measure in detail. In this section, we provide a comparison of our framework with other flow-based methods used for network traffic classification. For the application identification task, we have compared our work with [2]. In this comparison, we first have extracted 44 most used statistical flow-based features from the dataset, as the 111 initial features are not publicly available. Then, we have converted these instances to arff format to be used as Weka ${ }^{5}$ input. To have a fair comparison, we have used the ChiSquaredAttributeEval evaluator for feature selection and finally, applied all machine learning algorithms, including J48, Random Forest, $\mathrm{k}-\mathrm{NN}(\mathrm{k}=1)$, and Bayes Net with 10 fold cross-validation used in this work. Figure 4 a compares the proposed framework performance with [2] in terms of Precision, Recall, and F1-Measure for application identification task. In most

classes, our framework performed considerably better. Our framework gains much higher precision in identification of different applications in our dataset except for TeamViewer and JoinMe. In terms of recall, the work of [2] performed slightly better in detecting JoinMe and Ultra.

For the traffic characterization task, we made a comparison with the work of [1]. We have implemented a python code to extract time-related statistical flow-based features used by the authors. Using Weka, we have applied KNN and C4.5 in a 10-fold cross-validation setting which is the same as the one used in the paper. Figure $4 \mathrm{~b}$ compares the proposed framework performance with these

machine learning algorithms in terms of Precision, Recall, and F1-Measure for the traffic characterization task, showing that our work significantly outperforms the state-of-the-art work in application identification task.

As the proposed method in [7] is the only framework that has leveraged automata learning methods for application identification, we compare our work with it in terms of performance and time complexity (Table 3). Although the performance of [7] was computed in a two-class classification setting, NeTLang almost outperforms it in terms of Recall and F1-Measure. However, [7] has achieved higher overall precision because of its automata learning nature but owns time-consuming training and testing processes, while NeTLang considerably reduces the training time to the scale of minutes and it identifies application by observing only $4 k$ of a network trace.

## 数学代写|理论计算机代写theoretical computer science代考|Related Work

Machine Learning Based Methods in Traffic Classification. Many researchers applied the machine learning techniques in traffic classification. Time-related statistical flow-based features are used in [1] to characterize traffic in the presence/absence of a virtual private network (VPN). In this work, C4.5 decision tree and K-Nearest Neighbors are applied to the extracted features to classify flows. According to their result, the C4.5 algorithm performed better with the precision of $90 \%$ for traffic characterization.

Application identification from network flows is another traffic classification task performed in [2]. They used the CfsSubsetEval and ChiSquaredAttributeEval methods in Weka to optimize the number of features and supervised learning algorithms, including J48, Random Forest, k-NN, and Bayes Net. For the dataset they used, Random Forest and $\mathrm{k}$-NN had better performance in terms of accuracy $(90.87 \%$ and $93.94 \%$, respectively).

Deep Packet [3] is a recently proposed packet-level deep-learning-based framework that automatically extracts features for traffic classification. They used stacked autoencoders and one-dimensional CNN in their framework. This framework achieved a recall of $94 \%$ and $98 \%$ for traffic characterization and application identification tasks, respectively. The drawback of the deep-learningbased method is their time-consuming training process.

Automata Learning Based Methods in Traffic Classification. The most related work to ours is [7] in terms of utilizing a passive automata learning technique to derive the behavioral packet-level network models of applications. They provided a multi-steps algorithm consists of building an initial automaton and generalizing it by a state merging condition based on the behavior of well-known network protocols. Their fine granularity leads to produce large models, which make some steps of generalizing fulfill slowly for large models. Furthermore, the detection of an application was constrained to observing a complete trace of an application. Our method rectifies these shortcomings by the idea of learning a k-TSS language, which was inspired by [11]. Botnet detection has been studied in [8]. They passively learned automata for clusters of the dataset to characterize a communication profile. Other related work mainly use the automata learning for protocol modeling such as $[9,10]$ by active automata learning, while using the desired protocol implementation as their oracle system and alphabet of the automaton derived manually by an expert.

## 数学代写|理论计算机代写theoretical computer science代考|Dataset

记起 =吨磷吨磷+Fñ, 精确 =吨磷吨磷+F磷,F1=2∗RC∗磷rRC+磷r

## 数学代写|理论计算机代写theoretical computer science代考|Related Work

Deep Packet [3] 是最近提出的基于数据包级深度学习的框架，可自动提取特征进行流量分类。他们在他们的框架中使用了堆叠的自动编码器和一维 CNN。该框架实现了召回94%和98%分别用于流量表征和应用识别任务。基于深度学习的方法的缺点是训练过程耗时。

## 有限元方法代写

tatistics-lab作为专业的留学生服务机构，多年来已为美国、英国、加拿大、澳洲等留学热门地的学生提供专业的学术服务，包括但不限于Essay代写，Assignment代写，Dissertation代写，Report代写，小组作业代写，Proposal代写，Paper代写，Presentation代写，计算机作业代写，论文修改和润色，网课代做，exam代考等等。写作范围涵盖高中，本科，研究生等海外留学全阶段，辐射金融，经济学，会计学，审计学，管理学等全球99%专业科目。写作团队既有专业英语母语作者，也有海外名校硕博留学生，每位写作老师都拥有过硬的语言能力，专业的学科背景和学术写作经验。我们承诺100%原创，100%专业，100%准时，100%满意。

## MATLAB代写

MATLAB 是一种用于技术计算的高性能语言。它将计算、可视化和编程集成在一个易于使用的环境中，其中问题和解决方案以熟悉的数学符号表示。典型用途包括：数学和计算算法开发建模、仿真和原型制作数据分析、探索和可视化科学和工程图形应用程序开发，包括图形用户界面构建MATLAB 是一个交互式系统，其基本数据元素是一个不需要维度的数组。这使您可以解决许多技术计算问题，尤其是那些具有矩阵和向量公式的问题，而只需用 C 或 Fortran 等标量非交互式语言编写程序所需的时间的一小部分。MATLAB 名称代表矩阵实验室。MATLAB 最初的编写目的是提供对由 LINPACK 和 EISPACK 项目开发的矩阵软件的轻松访问，这两个项目共同代表了矩阵计算软件的最新技术。MATLAB 经过多年的发展，得到了许多用户的投入。在大学环境中，它是数学、工程和科学入门和高级课程的标准教学工具。在工业领域，MATLAB 是高效研究、开发和分析的首选工具。MATLAB 具有一系列称为工具箱的特定于应用程序的解决方案。对于大多数 MATLAB 用户来说非常重要，工具箱允许您学习应用专业技术。工具箱是 MATLAB 函数（M 文件）的综合集合，可扩展 MATLAB 环境以解决特定类别的问题。可用工具箱的领域包括信号处理、控制系统、神经网络、模糊逻辑、小波、仿真等。

## 数学代写|理论计算机代写theoretical computer science代考|Methodology

statistics-lab™ 为您的留学生涯保驾护航 在代写理论计算机theoretical computer science方面已经树立了自己的口碑, 保证靠谱, 高质且原创的统计Statistics代写服务。我们的专家在代写理论计算机theoretical computer science代写方面经验极为丰富，各种代写理论计算机theoretical computer science相关的作业也就用不着说。

• Statistical Inference 统计推断
• Statistical Computing 统计计算
• (Generalized) Linear Models 广义线性模型
• Statistical Machine Learning 统计机器学习
• Longitudinal Data Analysis 纵向数据分析
• Foundations of Data Science 数据科学基础

## 数学代写|理论计算机代写theoretical computer science代考|Methodology

Figure 2 illustrates the overall scheme of our methodology. There are three main modules Trace Generator, Language Learner, and Classifier. The trace generator module converts the packet traces of each application into a word list by first splitting the traces by using the timing parameters and then applying the Map function to identified network units. Then, the k-TSS network language of the application is learned from its words by the Language Learner module. By applying this process for all applications, a database of k-TSS languages is obtained. To classify traffic of a system, first, the Trace Generator module extracts its sessions and their network units using the timing parameters. Then, it converts the sessions to the symbolic words to prepare the inputs of the Language Learner module. Finally, the Classifier module compares the learned language of each session of the given traffic against the languages of the database to label them. We describe each module in detail in the following.

## 数学代写|理论计算机代写theoretical computer science代考|Trace Generator

The goal of the Trace Generator module is to transform the packet traces of each application into traces of smaller units, while preserving the relation among these units, as the letters of a language. To do so, at first, packet traces are split into smaller units, named network unit, by using the three timing parameters, introduced in Sect. 3. Then, these units are categorized based on their higherlayer protocol, and each category is clustered separately and named upon it. Consequently, the names of clusters constitute the Network Alphabet. Finally, the network alphabets are put in together to form our traces. Therefore, the input of this subproblem is $\Pi$ containing $m$ application traces and the timing parameters st, $f d$, and it, while the output is a set of word list $S=\left{W_{1}, W_{2}, \ldots, W_{m}\right}$, where $W_{i}$ is a list including the generated words of language $A p p_{i}$.

Network Unit Extraction. Network Units are extracted during three steps. At first, network traces are split into sessions based on the session threshold. Then, flow-sessions are extracted based on the inactive timeort. After that, flowsessions are divided into smaller units according to the flow duration constraint. Figure 1 shows the result of applying this module on a given trace to produce sessions, flow-sessions, and network units.

Network Unit Clustering. For naming network units, there is a need to group the similar ones and name them upon it. As there is no information on the number of units or their labels, an unsupervised machine learning algorithm named Kmeans $+t$ is used for clustering these units. The only information available for each flow are those stored in the frame header of its packets. Therefore, the clustering algorithm could benefit from the knowledge of the unit higher-layer protocol. Consequently, flows having the same protocol are at worst clustered together and named upon their protocol. The packets of each flow can be further clustered in terms of multiple statistical features listed in Table 1. Two groups of features are used for clustering network unit, the first group consists of features number one to six, and the second group includes features number three to nine. Regarding these two groups, the statistical features group (stats) is another configuration of this problem.

Moreover, it is required to set the number of clusters before performing the Kmeanst+ algorithm. To do so, the number of clusters is automatically determined with the help of the elbow [17] method in our approach.

Our defined Map function returns a concatenation of the symbol abbreviating packet protocol name and a natural number indicating its cluster number (to distinguish network units). For example, TL-5 corresponds to a packet sequence belonging to cluster 5 of the TLS protocol.

## 数学代写|理论计算机代写theoretical computer science代考|Language Learner

After preparation tasks, we obtain the modified word list $W^{\prime}=$ $\left[w_{1}^{\prime}, w_{2}^{\prime}, \ldots, w_{n}^{\prime}\right]$. Now, we learn a list of k-TSS languages for each $w^{\prime} \in$ $W^{\prime}$ as $\mathcal{L}\left(w^{\prime}\right)$ by a k-size frame scanner. For instance, for the given $w^{\prime}=$ SL-4 SL-2 T-10 TL-2 T-1 U-2, $\mathcal{L}\left(w^{\prime}\right)$ is obtained by k-TSS vector $Z\left(w^{\prime}\right)$ as $\langle\Sigma={\mathrm{SL}-2, \mathrm{TL}-2, \mathrm{~T}-1, \mathrm{U}-2, \mathrm{SL}-4, \mathrm{~T}-10}, I={\mathrm{SL}-4 \mathrm{SL}-2,}, F=$, ${\mathrm{T}-1 \mathrm{U}-2}, T={\mathrm{SL}-4 \mathrm{SL}-2 \mathrm{~T}-10, \mathrm{SL}-2 \mathrm{~T}-10 \mathrm{TL}-2, \mathrm{~T}-10 \mathrm{TL}-2 \mathrm{~T}-1, \mathrm{TL}-2$ T-1 U-2} .

To specify each word’s class, we observe its segments instead of exploring the whole word to be more noise tolerable. To this aim, we reduce the classification problem to find the similarity between the generated k-TTS languages of the trace and the ones in the database. For each k-TSS language in the database, we calculate its proximity with $\mathcal{L}\left(w^{\prime}\right)$ by a distance function defined as:

Definition 6 (Distance Function). Distance function $D$ measures the proximity metric between two k-TSS languages. Let $Z=\langle\Sigma, I, F, T\rangle$ be the $k$-test vector of $\mathcal{L}\left(w^{\prime}\right)$ and $Z_{i}=\left\langle\Sigma_{i}, I_{i}, F_{i}, T_{i}\right\rangle$ be the k-test vector of $\mathcal{L}\left(A p p_{i}\right)$. Then, $D\left(\mathcal{L}\left(w^{\prime}\right), \mathcal{L}\left(A p p_{i}\right)\right)$ is computed by five auciliary variables, measuring the sets diffenence fraction: $\triangle T, \Delta T_{i}, \Delta \Sigma, \triangle I$ and $\triangle F$ (defined in Eq. 1).
$\Delta T=\frac{T-T_{i}}{T}, \Delta T_{i}=\frac{T_{i}-T}{T_{i}}, \Delta \Sigma=\frac{\Sigma-\Sigma_{i}}{\Sigma}, \Delta I=\frac{I-I_{i}}{I}, \Delta F=\frac{F-F_{i}}{F} .$
$D\left(\mathcal{L}\left(w^{\prime}\right), \mathcal{L}\left(A p p_{i}\right)\right)=\overline{\Delta^{\prime} T} \overline{\triangle^{\prime} T_{i}} \overline{\Delta^{\prime} \Sigma} \overline{\triangle^{\prime} I} \overline{\triangle^{\prime} F}$
We assume a priority among these delta metrics as $\triangle T, \Delta T_{i}, \triangle \Sigma, \triangle I$ and $\triangle F$. Since $T$ carries more information of words than $I$ and $F$, we assign the highest priority to $\triangle T$. Also, we give $\triangle T$ higher priority than $\triangle T_{i}$, as $T$ is generated from a test trace in contrast to $T_{i}$ which is generated from a number of traces of an application. We specify the next priority to $\triangle \Sigma$ to take into account the alphabet sets differences. We assign the next priorities to $\triangle I$ and then $\triangle F$, respectively, because the trimming operation of the preparation phase may impact on the end of the words while their initials are not modified. Finally, we convert the value of delta metrics from a float number in the range $[0,1]$ to an integer number in the range $[0,99]$, renaming them by a prime sign, for example, $\triangle^{\prime} T$. By this priority, the distance function is defined as given by Eq. 2 .

A word $w$ is categorized in class $j$ if the k-TSS language of $w$ has the minimum distance with the k-TSS language of $A p p_{j}$ among all the applications:
$$\operatorname{Class}\left(w^{\prime}\right)=j \Longleftrightarrow \text { if } D\left(\mathcal{L}\left(w^{\prime}\right), \mathcal{L}\left(A p p_{j}\right)\right)=\operatorname{argmin}{\forall A p p{i} \in|\mathcal{A}|}\left(D\left(\mathcal{L}\left(w^{\prime}\right), \mathcal{L}\left(A p p_{i}\right)\right)\right)$$
Considering the running example $\left(\mathcal{L}\left(A p p_{i}\right)\right.$ in Sect. $4.2$ and $\left.\mathcal{L}\left(w^{\prime}\right)\right)$, the defined delta metrics are obtained as $\triangle T=0.75\left(\triangle^{\prime} T=74\right), \Delta T_{i}=0.85$ $\left(\triangle^{\prime} T_{i}=84\right), \triangle \Sigma=0.16\left(\triangle^{\prime} \Sigma=16\right), \triangle I=0\left(\triangle^{\prime} I=0\right)$, and $\triangle F=1\left(\triangle^{\prime} F=\right.$ $99)$ and finally, the distance is computed as $D\left(\mathcal{L}\left(w^{\prime}\right), \mathcal{L}\left(A p p_{i}\right)\right)=7484160099 .$

## 数学代写|理论计算机代写theoretical computer science代考|Trace Generator

Trace Generator 模块的目标是将每个应用程序的数据包跟踪转换为较小单元的跟踪，同时将这些单元之间的关系保留为语言的字母。为此，首先，通过使用第 3 节中介绍的三个时间参数，将数据包跟踪分成更小的单元，称为网络单元。3. 然后，这些单元根据它们的高层协议进行分类，每个类别被单独聚类并命名。因此，集群的名称构成了网络字母表。最后，将网络字母组合在一起形成我们的踪迹。因此，这个子问题的输入是圆周率包含米应用程序跟踪和时序参数 st，Fd, 而它, 而输出是一组单词列表S=\left{W_{1}, W_{2}, \ldots, W_{m}\right}S=\left{W_{1}, W_{2}, \ldots, W_{m}\right}， 在哪里在一世是一个列表，包括生成的语言单词一种pp一世.

## 数学代写|理论计算机代写theoretical computer science代考|Language Learner

Δ吨=吨−吨一世吨,Δ吨一世=吨一世−吨吨一世,ΔΣ=Σ−Σ一世Σ,Δ一世=一世−一世一世一世,ΔF=F−F一世F.
D(大号(在′),大号(一种pp一世))=Δ′吨¯△′吨一世¯Δ′Σ¯△′一世¯△′F¯

## 有限元方法代写

tatistics-lab作为专业的留学生服务机构，多年来已为美国、英国、加拿大、澳洲等留学热门地的学生提供专业的学术服务，包括但不限于Essay代写，Assignment代写，Dissertation代写，Report代写，小组作业代写，Proposal代写，Paper代写，Presentation代写，计算机作业代写，论文修改和润色，网课代做，exam代考等等。写作范围涵盖高中，本科，研究生等海外留学全阶段，辐射金融，经济学，会计学，审计学，管理学等全球99%专业科目。写作团队既有专业英语母语作者，也有海外名校硕博留学生，每位写作老师都拥有过硬的语言能力，专业的学科背景和学术写作经验。我们承诺100%原创，100%专业，100%准时，100%满意。

## MATLAB代写

MATLAB 是一种用于技术计算的高性能语言。它将计算、可视化和编程集成在一个易于使用的环境中，其中问题和解决方案以熟悉的数学符号表示。典型用途包括：数学和计算算法开发建模、仿真和原型制作数据分析、探索和可视化科学和工程图形应用程序开发，包括图形用户界面构建MATLAB 是一个交互式系统，其基本数据元素是一个不需要维度的数组。这使您可以解决许多技术计算问题，尤其是那些具有矩阵和向量公式的问题，而只需用 C 或 Fortran 等标量非交互式语言编写程序所需的时间的一小部分。MATLAB 名称代表矩阵实验室。MATLAB 最初的编写目的是提供对由 LINPACK 和 EISPACK 项目开发的矩阵软件的轻松访问，这两个项目共同代表了矩阵计算软件的最新技术。MATLAB 经过多年的发展，得到了许多用户的投入。在大学环境中，它是数学、工程和科学入门和高级课程的标准教学工具。在工业领域，MATLAB 是高效研究、开发和分析的首选工具。MATLAB 具有一系列称为工具箱的特定于应用程序的解决方案。对于大多数 MATLAB 用户来说非常重要，工具箱允许您学习应用专业技术。工具箱是 MATLAB 函数（M 文件）的综合集合，可扩展 MATLAB 环境以解决特定类别的问题。可用工具箱的领域包括信号处理、控制系统、神经网络、模糊逻辑、小波、仿真等。

## 数学代写|理论计算机代写theoretical computer science代考|Implementation and Experimental Results

statistics-lab™ 为您的留学生涯保驾护航 在代写理论计算机theoretical computer science方面已经树立了自己的口碑, 保证靠谱, 高质且原创的统计Statistics代写服务。我们的专家在代写理论计算机theoretical computer science代写方面经验极为丰富，各种代写理论计算机theoretical computer science相关的作业也就用不着说。

• Statistical Inference 统计推断
• Statistical Computing 统计计算
• (Generalized) Linear Models 广义线性模型
• Statistical Machine Learning 统计机器学习
• Longitudinal Data Analysis 纵向数据分析
• Foundations of Data Science 数据科学基础

## 数学代写|理论计算机代写theoretical computer science代考|Implementation and Experimental Results

We implemented our heuristics as a package in the PRISM model checker. Our implementations are based on sparse engine of PRISM [9] which is developed in $\mathrm{C}$ and are available in [18]. We used several standard case studies which have been used in previous works $[2,4,10,13,15,16]$. We compare the running time of our heuristics with the running time of well-known previous methods. We only focus on the running time of the iterative computations for quantitative properties. We exclude the running times for model constructions that are the same for all methods. The running time of the graph-based pre-computation are negligible in appropriate implementations [7]. For SCC-based methods and our Dirac-based ones, the reported times include the running times for SCC decomposition and Dirac-based reductions. All benchmarks have been run on a machine with Corei7 CPU (2.8 GHz, 4 main cores) and 8 GB RAM running Ubuntu 18. We use Consensus, Zeroconf, finewire_abstract, brp, nand and Crowds case studies for comparing the performance of iterative methods for reachability probabilities and use Wlan, CSMA, Israeli-jalefon and Leader cases for expected rewards. These case studies are explained in $[4,15,16]$. More details about our experiments and model parameters are available as log-files in [18]. Although there are several other standard case studies, their graphical structure do not have any cycle and a topological backward iterative method can be used to computed their underlying properties in linear time $[13,14]$. Hence, we focus on the case studies of Table 1 , where have several non-trivial cycles.

We consider the running time of the standard iterative methods and well known improved techniques from previous works. To perform a fair comparison, we use sparse engine of PRISM for all experiments and we also implemented topological (SCC-based) method for this engine. In this case, we use MPI to solve each SCC. For learning-based methods, we use the implementation that is proposed in [15] and for backward value iteration, we implement the proposed method from [14]. For all case studies, we consider $\epsilon=10^{-8}$ as the threshold.
Table 1 shows the running time of the iterative methods for MDP models. For the SCC-based method, we use MPI to approximate the reachability probability values of each SCCs. All times are in seconds. We use * where a method does not terminate after an hour. For consensus, Israeli-jalefon, Leader and Wlan models, the running time of SCC-based method is less than the others. In these cases, we use our Dirac-based method to reduce the running time of the computations for each SCC. The results show that for these two classes, our technique is faster than the other methods. The learning-based method is faster than other methods for zeroconf cases with $\mathrm{N}=20$. For firewire case studies, $\mathrm{SCC}$ based and backward value iteration methods are much faster than the standard iterative methods. In this case Dirac-based method for MPI (without SCC decomposition) works better than the other methods.

## 数学代写|理论计算机代写theoretical computer science代考|Learning for Network Traffic Classification

Characterizing the network traffic and identifying running applications play an important role in several network administration tasks such as protecting against malicious behaviors, firewalling, and balancing bandwidth usage. Recently, dynamic port assignment and encryption protocol usage have considerably reduced the performance of the classic traffic classification methods, including port-based and deep packet inspection. This leads researchers to apply Machine-Learning techniques and behavioral pattern detection for traffic classification. Machine-Learning approaches classify network traffic based on statistical

features with the granularity of flow [1,2] or packet [3], and hence, they ignore temporal relations among flows, and as a result, their false positive rates are not negligible although they are fast. In behavioral classification methods [46 , an expert extracts specific behavioral aspects of a particular application or application type for the classification purpose. For instance, link establishment topology was used as the distinctive metric to classify P2P-TV traffic in [5].
Assuming a packet trace as a word of the language of an application, one can derive an automaton modeling the traffic behavior of that application. Automata learning approaches have been recently used to automatically derive the model of applications $[7,8]$, network protocols $[9,10]$, or Botnet behavior [8]. The alphabets of the learned automata are either manually defined by domain experts which is not straightforward, or in terms of packets which may cause overfitting. As some packets always appear together, we can consider a sequence of related packets together as a symbol of the alphabet. To this aim, we apply machine learning techniques to automatically define the alphabet set based on the timing and statistical features of packets.

The derived automata are used for traffic classification. A packet trace is classified into an application if the model of that application accepts it. Using automata learning methods, the classification problem is constrained to observe the complete trace of an application to verdict its acceptance/rejection. To tackle this challenge, inspired by [11], we upgrade the detection of an application based on partial observation of a trace, a window of size $k$, and derive a model that accepts a k-Testable language in the Strict Sense (k-TSS) [12]. K-TSS, a class of regular languages, also known as window language, allows to locally accept or reject a word by a sliding window parser of size $k$. We relax the acceptance condition of automata learning using machine learning by defining a proximity metric to be compatible with the local essence of the learned language. The proposed proximity metric is defined as a distance function. We have implemented our approach in a framework called Network Traffic Language learner, NeTLang. We evaluate the performance of our approach by applying it to real-world network traffic and compare it with machine and automata learning approaches. We achieved F1-Measure of $97 \%$ for both application identification and traffic characterization tasks. In summary, first, we learn the alphabet using a machine learning technique. Then, the network language is learned through an automata learning approach. Finally, the classifier identifies the classes based on our defined distance metrics on the input and the learned models. Our method makes these contributions: 1) Utilizing locally testable language learning in the traffic classification problem, 2) Extracting the domain-based alphabet automatically, 3) Upgrading the word acceptance by a new proximity metric. With these contributions, the following improvements are brought into traffic classification:

• Considering a sequence of related packets as the appropriate granularity of the problem, instead of per-packet detection which is too fine-grained or per-flow detection which is too coarse-grained,Providing highly accurate models for applications as our automata learning approach considers the temporal relation among flows and the way they are interleaved,
• Decreasing the classification time by considering only some first packets of a trace with a help of a novel distance function.

## 数学代写|理论计算机代写theoretical computer science代考|Background on Automata Learning

In this section, we provide some background on automata learning concepts used in our methodology. Learning a regular language from given positive samples (words belonging to the language) is a common problem in grammatical inference. To solve this problem, many algorithms were proposed to find the smallest deterministic finite automaton (DFA) that accepts the positive examples. In this paper, we focus on learning k-testable languages in the strict sense, a subset of a regular language, called k-TSS, initially was introduced by [12]. In such a language, words are determined by allowed three sets of prefixes and suffixes of length $k-1$ and substrings of length $k$. It has been proven that it is possible to learn k-TSS languages in the limit [13]. To learn this language, the only effort is to scan the accepting words while simultaneously constructing the allowed three set. The locally testable feature of this language makes it appropriate for network traffic classification and other domains, such as pattern recognition [14] and DNA sequence analysis [15]. In the following, we provide the formal definition of k-TSS language taken from [16].

Definition 1 (k-test Vector). Let $k>0, a$-test vector is determined by $a$ 5 -tuple $Z=\langle\Sigma, I, F, T, C\rangle$ where

• $\Sigma$ is a finite alphabet,
• $I \subseteq \Sigma^{(k-1)}$ is a set of allowed prefixes of length less than $k$,
• $F \subseteq \Sigma^{(k-1)}$ is a set of allowed suffixes of length less than $k$,
• $T \subseteq \Sigma^{k}$ is a set of allowed segments, and
• $C \subseteq \Sigma^{0$. Then Language $\mathcal{L}(Z)$ in the strict sense $(k-T S S)$ is computed as:
$$\mathcal{L}(Z)=\left[\left(I \Sigma^{} \cap \Sigma^{} F\right)-\Sigma^{}\left(\Sigma^{k}-T\right) \Sigma^{}\right] \cup C$$
For instance, consider $k=3$ and $Z=\langle\Sigma={a, b}, I={a b}, F=$ ${a b, b a}, T={a b a, a b b, b b a}, C={a b}\rangle$, then, $a b a, a b b a \in \mathcal{L}(Z)$ since they are preserving the allowed sets of $Z$, while $b a b, a b b, a b a b, a$ do not belong to $\mathcal{L}(z)$ because, in order, they violate $I \mathrm{~ ( b a ~}$ $(a \notin C$ ). To construct the k-test vector of a language, we scan the accepted word by a k-size frame. By scanning the word $a b b a$ by a 3 -size frame, $a b, b a$, and $a b b, b b a$ are added to $I, F$, and $T$, respectively.

In our problem, we produce words such that their length is greater than or equal to $k$. It means that $C$ always is empty. Hence, for simplicity, we eliminate $C$ from the k-TSS vector for the rest of the paper.、

## 数学代写|理论计算机代写theoretical computer science代考|Learning for Network Traffic Classification

• 考虑一系列相关数据包作为问题的适当粒度，而不是太细粒度的每个数据包检测或太粗粒度的每个流检测，为我们的自动机学习方法考虑的应用程序提供高精度模型流之间的时间关系以及它们交错的方式，
• 在新的距离函数的帮助下，通过仅考虑跟踪的一些第一个数据包来减少分类时间。

## 数学代写|理论计算机代写theoretical computer science代考|Background on Automata Learning

• Σ是一个有限的字母表，
• 一世⊆Σ(ķ−1)是一组长度小于的允许前缀ķ,
• F⊆Σ(ķ−1)是一组允许的长度小于的后缀ķ,
• 吨⊆Σķ是一组允许的段，并且
• $C \subseteq \Sigma^{0.吨H和n大号一种nG在一种G和\数学{L} (Z)一世n吨H和s吨r一世C吨s和ns和(千吨不锈钢)一世sC这米p在吨和d一种s:大号(从)=[(一世Σ∩ΣF)−Σ(Σķ−吨)Σ]∪CF这r一世ns吨一种nC和,C这ns一世d和rk=3一种ndZ=\langle\Sigma={a, b}, I={ab}, F={ab, ba}, T={aba, abb, bba}, C={ab}\rangle,吨H和n,aba, abba \in \mathcal{L}(Z)s一世nC和吨H和是一种r和pr和s和r在一世nG吨H和一种ll这在和ds和吨s这F从,在H一世l和bab, abb, abab, 一个d这n这吨b和l这nG吨这\数学{L} (z)b和C一种在s和,一世n这rd和r,吨H和是在一世这l一种吨和我 \mathrm{~ ( ba ~}(a \notin C).吨这C这ns吨r在C吨吨H和ķ−吨和s吨在和C吨这r这F一种l一种nG在一种G和,在和sC一种n吨H和一种CC和p吨和d在这rdb是一种ķ−s一世和和Fr一种米和.乙是sC一种nn一世nG吨H和在这rd阿爸b是一种3−s一世和和Fr一种米和,呸呸呸,一种ndabb, bba一种r和一种dd和d吨这如果,一种ndT$，分别。

## 有限元方法代写

tatistics-lab作为专业的留学生服务机构，多年来已为美国、英国、加拿大、澳洲等留学热门地的学生提供专业的学术服务，包括但不限于Essay代写，Assignment代写，Dissertation代写，Report代写，小组作业代写，Proposal代写，Paper代写，Presentation代写，计算机作业代写，论文修改和润色，网课代做，exam代考等等。写作范围涵盖高中，本科，研究生等海外留学全阶段，辐射金融，经济学，会计学，审计学，管理学等全球99%专业科目。写作团队既有专业英语母语作者，也有海外名校硕博留学生，每位写作老师都拥有过硬的语言能力，专业的学科背景和学术写作经验。我们承诺100%原创，100%专业，100%准时，100%满意。

## MATLAB代写

MATLAB 是一种用于技术计算的高性能语言。它将计算、可视化和编程集成在一个易于使用的环境中，其中问题和解决方案以熟悉的数学符号表示。典型用途包括：数学和计算算法开发建模、仿真和原型制作数据分析、探索和可视化科学和工程图形应用程序开发，包括图形用户界面构建MATLAB 是一个交互式系统，其基本数据元素是一个不需要维度的数组。这使您可以解决许多技术计算问题，尤其是那些具有矩阵和向量公式的问题，而只需用 C 或 Fortran 等标量非交互式语言编写程序所需的时间的一小部分。MATLAB 名称代表矩阵实验室。MATLAB 最初的编写目的是提供对由 LINPACK 和 EISPACK 项目开发的矩阵软件的轻松访问，这两个项目共同代表了矩阵计算软件的最新技术。MATLAB 经过多年的发展，得到了许多用户的投入。在大学环境中，它是数学、工程和科学入门和高级课程的标准教学工具。在工业领域，MATLAB 是高效研究、开发和分析的首选工具。MATLAB 具有一系列称为工具箱的特定于应用程序的解决方案。对于大多数 MATLAB 用户来说非常重要，工具箱允许您学习应用专业技术。工具箱是 MATLAB 函数（M 文件）的综合集合，可扩展 MATLAB 环境以解决特定类别的问题。可用工具箱的领域包括信号处理、控制系统、神经网络、模糊逻辑、小波、仿真等。

## 数学代写|理论计算机代写theoretical computer science代考| Improving Iterative Methods for Quantitative

statistics-lab™ 为您的留学生涯保驾护航 在代写理论计算机theoretical computer science方面已经树立了自己的口碑, 保证靠谱, 高质且原创的统计Statistics代写服务。我们的专家在代写理论计算机theoretical computer science代写方面经验极为丰富，各种代写理论计算机theoretical computer science相关的作业也就用不着说。

• Statistical Inference 统计推断
• Statistical Computing 统计计算
• (Generalized) Linear Models 广义线性模型
• Statistical Machine Learning 统计机器学习
• Longitudinal Data Analysis 纵向数据分析
• Foundations of Data Science 数据科学基础

## 数学代写|理论计算机代写theoretical computer science代考|Reachability Probabilities

We use a heuristic to improve the performance of iterative methods for DTMCs. It considers transitions with Dirac distributions. For this class of transitions, the reachability probability values of the source and destination states of the transition are the same. The idea of our heuristic is to avoid useless updates in order to reduce the total number of updates before termination. We use it to improve the performance of policy iteration method for computing reachability probabilities in MDPs. In this case, Dirac transitions are considered to reduce the number of updates in the computations of each quotient DTMCs. We apply this technique on the SCC-based method to improve the performance of this approach. Although the idea of considering Dirac transitions has been used as a reduction technique in [8], it has been only proposed for the case of deterministic Dirac transitions. The main contribution of our work is to use this heuristic for improving the MPI method for both reachability probabilities and expected rewards to cover nondeterministic action selections. To the bet of our knowledge, none of state of the art model checkers support this technique.

## 数学代写|理论计算机代写theoretical computer science代考|Using Transitions with Dirac Probability

In our work, we consider transitions with Dirac distribution (which we call Dirac transitions) to avoid redundant updates of state values. This type of transitions is also used in statistical model checking [6] but we use it to accelerate iterative methods. For a DTMC D, a Dirac transition is a pair of states $s$ and $s^{\prime}$ with $\mathbf{P}\left(s, s^{\prime}\right)=1$. Based on the definition, for this pair of states, we have $\operatorname{Pr}^{D}\left(\right.$ reach $\left.{s}(G)\right)=\operatorname{Pr}^{D}\left(\right.$ reach $\left.{s^{\prime}}(G)\right)$. As a result, we can ignore this transition and postpone the update of the value of $s$ until the convergence of the value of $s^{\prime}$. Our approach updates the value of $s$ only once and avoids redundant updates. However, there may be some incoming transitions to $s$ that need the value of $s$ in every iteration. Consider Fig. 1 and suppose $s$ is the target state of a transition of the form $(t, s)$. In this case the value of $s$ affects the value of $t$ and we should modify the incoming transitions to $s$ to point to $s^{\prime}$. Algorithm 2 shows details of our approach to remove Dirac transitions from a DTMC $D$ and reduce it to smaller DTMC $D^{\prime}$. The main idea of this algorithm is to partition the state space of $D$ to Dirac-based classes. A Dirac-based class is the set of states that are connected by Dirac transitions. The algorithm uses the $D B C$ array to partition $S^{?}$ to the related classes. Consider for example a sequence of states of the form $s_{i}, s_{j}, \ldots, s_{k-1}, s_{k}$ where there exists Dirac transitions between each two consecutive states and the outgoing transition of $s_{k}$ is not a Dirac one or $s_{k}$ is one of $s_{i}, \ldots, s_{k-1}$. We put these states into one class and set their $D B C$ to $s_{k}$. The reachability probability of all states of this class are equal to the reachability probability of $s_{k}$. Formally, for each state $s_{i} \in S$, we define $D B C\left[s_{i}\right]$ as the index of the last state in the sequence of states that are connected with Dirac transitions. For every state $s \in S$ we have $x_{s}=x_{D B C[s]}$. Algorithm 2 initiates the values of this array in lines 3-5. For every state $s_{i} \in S$ that $D B C\left[s_{i}\right]$ is not determined previously, the algorithm calls the Find_Dirac_Index function to determine its $D B C$. After determining the $D B C$ value of all states of $S$, the algorithm creates the reduced DTMC $D^{\prime}$ (lines $9-13$ ). The states of $D^{\prime}$ are those states $s_{i} \in S$ for which $D B C\left[s_{i}\right]=i$. According to the definition of $S^{1}$, for each state $s \in G$ and each $s^{\prime} \in S$ if $D B C\left[s^{\prime}\right]=D B C[s]$ then we have $s^{\prime} \in G$.

## 数学代写|理论计算机代写theoretical computer science代考|Improving Iterative Methods for Computing Reachability

The Dirac-based reduction technique can be also used for MDPs to avoid redundant updates. However, it can not be directly applied for states with multiple enabled actions. To cover this case and as a novelty of our approach, we use the MPI method. We apply our heuristic for every quotient DTMC to reduce the number of states that should be updated. We propose our it as an extension of SCC-based techniques. We use MPI to compute reachability values of the states of each SCC and apply our Dirac-based heuristic to accelerate the iterative computations of each quotient DTMC. Algorithm 5 presents the overall idea of our method for accelerating SCC-based methods for MDPs. The correctness of this approach relies on the correctness of SCC decomposition for MDPs [13] and the correctness of our Dirac-based DTMC reduction method (Lemma 1 ).

The standard iterative methods can be used to approximate the Bellman equation for extremal expected rewards. In this case, the initial vector of values is set to zero for all states. Value iteration (or policy iteration) should also consider the defined reward of each action in the update of values of each state. More details about iterative methods for expected rewards are available in $[1,2,16]$. To use our heuristic for expected rewards, we use the fact that for every Dirac transition of the form $\left(s_{i}, s_{j}\right)$ we have $x_{s_{i}}=x_{s_{j}}+R\left(x_{i}\right)$ where $x_{i}$ and $x_{j}$ are the expected values for these two states and $R\left(x_{i}\right)$ is the reward for $x_{i}$. In this case, an iterative method does not need to update the value of $x_{i}$ in every iteration. Similar to the case for the reachability probabilities, the incoming transitions to $x_{i}$ should be modified to point to $x_{j}$. In addition the reward of a modified transition is modified by adding the reward of the related Dirac transition.

## 有限元方法代写

tatistics-lab作为专业的留学生服务机构，多年来已为美国、英国、加拿大、澳洲等留学热门地的学生提供专业的学术服务，包括但不限于Essay代写，Assignment代写，Dissertation代写，Report代写，小组作业代写，Proposal代写，Paper代写，Presentation代写，计算机作业代写，论文修改和润色，网课代做，exam代考等等。写作范围涵盖高中，本科，研究生等海外留学全阶段，辐射金融，经济学，会计学，审计学，管理学等全球99%专业科目。写作团队既有专业英语母语作者，也有海外名校硕博留学生，每位写作老师都拥有过硬的语言能力，专业的学科背景和学术写作经验。我们承诺100%原创，100%专业，100%准时，100%满意。

## MATLAB代写

MATLAB 是一种用于技术计算的高性能语言。它将计算、可视化和编程集成在一个易于使用的环境中，其中问题和解决方案以熟悉的数学符号表示。典型用途包括：数学和计算算法开发建模、仿真和原型制作数据分析、探索和可视化科学和工程图形应用程序开发，包括图形用户界面构建MATLAB 是一个交互式系统，其基本数据元素是一个不需要维度的数组。这使您可以解决许多技术计算问题，尤其是那些具有矩阵和向量公式的问题，而只需用 C 或 Fortran 等标量非交互式语言编写程序所需的时间的一小部分。MATLAB 名称代表矩阵实验室。MATLAB 最初的编写目的是提供对由 LINPACK 和 EISPACK 项目开发的矩阵软件的轻松访问，这两个项目共同代表了矩阵计算软件的最新技术。MATLAB 经过多年的发展，得到了许多用户的投入。在大学环境中，它是数学、工程和科学入门和高级课程的标准教学工具。在工业领域，MATLAB 是高效研究、开发和分析的首选工具。MATLAB 具有一系列称为工具箱的特定于应用程序的解决方案。对于大多数 MATLAB 用户来说非常重要，工具箱允许您学习应用专业技术。工具箱是 MATLAB 函数（M 文件）的综合集合，可扩展 MATLAB 环境以解决特定类别的问题。可用工具箱的领域包括信号处理、控制系统、神经网络、模糊逻辑、小波、仿真等。

## 数学代写|理论计算机代写theoretical computer science代考| Probabilistic Model Checking

statistics-lab™ 为您的留学生涯保驾护航 在代写理论计算机theoretical computer science方面已经树立了自己的口碑, 保证靠谱, 高质且原创的统计Statistics代写服务。我们的专家在代写理论计算机theoretical computer science代写方面经验极为丰富，各种代写理论计算机theoretical computer science相关的作业也就用不着说。

• Statistical Inference 统计推断
• Statistical Computing 统计计算
• (Generalized) Linear Models 广义线性模型
• Statistical Machine Learning 统计机器学习
• Longitudinal Data Analysis 纵向数据分析
• Foundations of Data Science 数据科学基础

## 数学代写|理论计算机代写theoretical computer science代考|Probabilistic Model Checking

The aim of probabilistic model checking is to verify a desired quantitative or qualitative property of the system. A main class of PCTL properties includes reachability probabilities and expected rewards. For DTMCs, a reachability probability is the probability of reaching the set of goal states $G$ and for MDPs, it is the extremal probability of reaching $G$. Reward-based properties are defined as the expected accumulated reward (extremal expected reward in the case of MDPs) until reaching a goal state [1]. To formally reason about reachability probabilities, we need to define a probability measure on the set of paths that reach to some states in $G$. For a state $s_{0} \in S$, let reach ${ }{s{0}}(G)$ be the set of all paths that start from $s_{0}$ and have a state from $G$ :
$$\text { reach }{s{0}}(G) \stackrel{\text { def }}{=}\left{\pi \in P a t h s_{D_{1} s_{0}} \mid \pi[i] \in G \text { for some } i \geq 0\right} .$$
The probability measure on the set $\operatorname{reach}{s{0}}(G)$ is defined as [1]:
$$\operatorname{Pr}^{D}\left(\operatorname{reach}{s{0}}(G)\right)=\sum_{s_{0} s_{1} \ldots s_{n} \in \text { reach } x_{0}}(G) \operatorname{Pr}^{D}\left(\operatorname{Cyl}\left(s_{0} s_{1} \ldots s_{n}\right)\right)$$
For MDPs, reachability probabilities are defined as the extremal probabilities of reaching goal states $[13,16]$ :
$$P r_{\max }^{M}\left(\text { reach }{s{0}}(G)\right) \stackrel{\text { def }}{=} \sup {\sigma \in \operatorname{Pol}{M}} \operatorname{Pr}{\sigma}^{M}\left(\operatorname{reach}{s_{0}}(G)\right)$$
Reachability probability properties are divided into qualitative and quantitative properties. A qualitative property of a probabilistic system means the probability of reaching the set of goal states is either 0 or 1 [1]. Qualitative verification is a method to find the set of states for which this reachability probability is exactly 0 or 1 . We denote these sets by $S^{0}$ and $S^{1}$, respectively. For the case of MDPs, we are interested to find those states for which the maximum or minimum reachability probability is 0 or 1 . These sets are denoted by $S_{\text {min }}^{0}$, $S_{\max }^{0}, S_{\min }^{1}$ and $S_{\max }^{1}$ and can be computed by graph-based methods [2].

## 数学代写|理论计算机代写theoretical computer science代考|Quantitative Properties

In probabilistic model checking, we consider a $\mathrm{PCTL}$ property to be quantitative if the probability of reaching goal states is not exactly 0 or 1 . Verification of quantitative properties usually reduces to solving a linear time equation system (for DTMCs) or solving a Bellman equation for MDPs [3,5]. For an arbitrary state $s \in S$ in a DTMC $D$, let $x_{s}$ be the probability of reaching $G$ from $s$, i.e., $x_{s}=P r^{D}($ reach $s(G))$. The values of $x_{s}$ for all $s \in S$ are obtained as the unique solution of the linear equation system $[1,2]$ :
$$x_{s}=\left{\begin{array}{lll} 0 & \text { if } & s \in S^{0} \ 1 & \text { if } & s \in S^{1} \ \sum_{s^{\prime} \in S} \mathbf{P}\left(s, s^{\prime}\right) \cdot x_{s^{\prime}} & \text { if } & s \in S^{?} \end{array}\right.$$
where $S^{?}=S \backslash\left(S^{0} \cup S^{1}\right)$. A model checker can use any direct method (e.g. Gaussian elimination) or iterative method (e.g. Jacobi, Gauss-Seidel) to compute the solution of this linear equation system.

For the case of MDPs, we consider $x_{s}$ as the maximum (or minimum) probability of reaching $G$, i.e., $x_{s}=P r_{\max }^{M}\left(\right.$ reach $\left.h_{s}(G)\right)$. In this case, the values of $x_{s}$ are obtained as the solution of the Bellman equation system:
where $S_{\max }^{?}=S \backslash\left(S_{\max }^{0} \cup S_{\max }^{1}\right)$. Using $x_{s}$ for the maximal expected accumulated reward, we have $[1,2]$ :
$$x_{s}= \begin{cases}0 & \text { if } s \in G \ \infty & \text { if } s \notin S_{\min }^{1} \ \max {\alpha \in A c t(s)} \sum{s^{\prime} \in S}\left(R\left(s, \alpha, s^{\prime}\right)+\delta(s, \alpha)\left(s^{\prime}\right) \cdot x_{s^{\prime}}\right) & \text { otherwise }\end{cases}$$
Some standard direct algorithms (like Simplex algorithm [11]) are able to compute the precise values for the Bellman equation. The main drawback of direct methods is their scalability that limits them to relatively small models [2]. Iterative methods are other alternatives to approximate the values of $x_{s^{*}}$.

## 数学代写|理论计算机代写theoretical computer science代考|Iterative Methods for Quantitative Reachability Probabilities

Value iteration (VI) and policy iteration (PI) are two standard iterative methods that are used to compute the quantitative properties in probabilistic systems. VI and its Gauss-Seidel extension are widely studied in the previous works $[1,2$, $13,15,16]$. We review PI, which is used in the remaining of this paper.

Policy Iteration. This method iterates over policies in order to find the optimal policy that maximizes reachability probabilities of all states. Starting from an arbitrary policy, it improves policies until reaching no change in them [2]. For each policy, the method uses an iterative method to compute the reachability probability values of quotient DTMCs and updates the value of states of the original MDP. The termination of this method is guaranteed for every finite MDP [2]. Modified policy iteration (MPI) [5] performs a limited number of iterations for each quotient DTMC (100 iterations for example) and updates the policy after doing this number of iterations. Algorithm 1 shows the standard policy iteration to compute $\operatorname{Pr}{\max }^{M}\left(\right.$ reach $\left.{s}(G)\right)$. We consider a policy as a mapping $\sigma$ from states to actions. In lines $7-9$, the algorithm uses a greedy approach to update the policy $\sigma$. More details about this algorithm are available in [2].

## 数学代写|理论计算机代写theoretical computer science代考|Quantitative Properties

$$x_{s}=\左{0 如果 s∈小号0 1 如果 s∈小号1 ∑s′∈小号磷(s,s′)⋅Xs′ 如果 s∈小号?\对。$$

$$x_{s}= \begin{cases}0 & \text { if } s \in G \ \infty & \text { if } s \notin S_{\min }^{1} \ \max {\ alpha \in A ct(s)} \sum {s^{\prime} \in S}\left(R\left(s, \alpha, s^{\prime}\right)+\delta(s, \ alpha)\left(s^{\prime}\right) \cdot x_{s^{\prime}}\right) & \text { else }\end{cases}$$

## 有限元方法代写

tatistics-lab作为专业的留学生服务机构，多年来已为美国、英国、加拿大、澳洲等留学热门地的学生提供专业的学术服务，包括但不限于Essay代写，Assignment代写，Dissertation代写，Report代写，小组作业代写，Proposal代写，Paper代写，Presentation代写，计算机作业代写，论文修改和润色，网课代做，exam代考等等。写作范围涵盖高中，本科，研究生等海外留学全阶段，辐射金融，经济学，会计学，审计学，管理学等全球99%专业科目。写作团队既有专业英语母语作者，也有海外名校硕博留学生，每位写作老师都拥有过硬的语言能力，专业的学科背景和学术写作经验。我们承诺100%原创，100%专业，100%准时，100%满意。

## MATLAB代写

MATLAB 是一种用于技术计算的高性能语言。它将计算、可视化和编程集成在一个易于使用的环境中，其中问题和解决方案以熟悉的数学符号表示。典型用途包括：数学和计算算法开发建模、仿真和原型制作数据分析、探索和可视化科学和工程图形应用程序开发，包括图形用户界面构建MATLAB 是一个交互式系统，其基本数据元素是一个不需要维度的数组。这使您可以解决许多技术计算问题，尤其是那些具有矩阵和向量公式的问题，而只需用 C 或 Fortran 等标量非交互式语言编写程序所需的时间的一小部分。MATLAB 名称代表矩阵实验室。MATLAB 最初的编写目的是提供对由 LINPACK 和 EISPACK 项目开发的矩阵软件的轻松访问，这两个项目共同代表了矩阵计算软件的最新技术。MATLAB 经过多年的发展，得到了许多用户的投入。在大学环境中，它是数学、工程和科学入门和高级课程的标准教学工具。在工业领域，MATLAB 是高效研究、开发和分析的首选工具。MATLAB 具有一系列称为工具箱的特定于应用程序的解决方案。对于大多数 MATLAB 用户来说非常重要，工具箱允许您学习应用专业技术。工具箱是 MATLAB 函数（M 文件）的综合集合，可扩展 MATLAB 环境以解决特定类别的问题。可用工具箱的领域包括信号处理、控制系统、神经网络、模糊逻辑、小波、仿真等。

## 数学代写|理论计算机代写theoretical computer science代考|Dirac-Based Reduction Techniques for Quantitative Analysis of Discrete-Time Markov Models

statistics-lab™ 为您的留学生涯保驾护航 在代写理论计算机theoretical computer science方面已经树立了自己的口碑, 保证靠谱, 高质且原创的统计Statistics代写服务。我们的专家在代写理论计算机theoretical computer science代写方面经验极为丰富，各种代写理论计算机theoretical computer science相关的作业也就用不着说。

• Statistical Inference 统计推断
• Statistical Computing 统计计算
• (Generalized) Linear Models 广义线性模型
• Statistical Machine Learning 统计机器学习
• Longitudinal Data Analysis 纵向数据分析
• Foundations of Data Science 数据科学基础

## 数学代写|理论计算机代写theoretical computer science代考|Introduction

Model checking is a formal approach for verifying quantitative and qualitative properties of computer systems. In this way, the system is modelled by a labelled transition system, and its properties are specified in temporal logic. Because of some stochastic behaviours of these systems, we can use probabilistic model checking to analyse the quantitative property specifications of these systems [1-3]. In this domain, we can use discrete and continuous time Markov Chains to model fully probabilistic systems. Besides, Markov Decision Processes [5] are used to model systems with both probabilistic and non-deterministic behaviours. Probabilistic Computation Tree Logic (PCTL) [1] is used to formally specify the

related system properties. A main part of PCTL properties against MDPs can be verified by computing the extremal reachability probability: The maximal or minimal probability of reaching a given set of goal states. For quantitative parts, numerical computations are needed to calculate these reachability probabilities $[2,6]$. Linear programming $[1,3]$, value iteration and policy iteration are wellknown numerical approaches for computing the optimal reachability probabilities $[2,5]$. PRISM [4] and STORM [7] are examples of probabilistic model checkers that use these numerical methods to compute reachability probabilities.

One of the main challenges of model checking in all variants is the state space explosion problem, i.e., the excessive space requirement to store the states of the model in the memory $[1,9]$. For probabilistic model checking, we have the additional difficulty of solving linear programs. For the feasibility of the algorithms we need efficient heuristics to decrease the running time of these algorithms [3]. A wide range of approaches has been proposed for probabilistic model checking in previous works to tackle these problems. Symbolic model checking [9], compositional verification [10], symmetry reduction for probabilistic systems $[11]$, incremental model construction [12] and statistical model checking [6] have been proposed to reduce the needed space for probabilistic model checking. In addition, several approaches are used to accelerate standard algorithms for probabilistic model checking. SCC-based approaches $[13,14]$ identify strongly connected components (SCCs) of the underlying model and compute reachability probabilities of the states of each component in a right order. Learning based algorithms use the idea of real-time dynamic programming to solve reachability probability problems of MDPs [15]. Prioritization methods focus on finding a good state ordering to update the values of states during iterative computations $[13,14]$. The idea of finding Maximal End Components (MECs) is used in [3] to reduce the number of states of the model. Several techniques are proposed in [17] to reduce the size of a DTMC model. These techniques are used to reduce the model for finite-horizon properties.

## 数学代写|理论计算机代写theoretical computer science代考|Preliminaries

In this section, we provide an overview of DTMCs and MDPs and reachability properties. We mainly follow the notations of $[1,13]$. Let $S$ be a countable set. A discrete probability distribution on $S$ is a function $P: S \rightarrow[0,1]$ satisfying $\sum_{s \in S} P(s)=1$. We use $\operatorname{Dist}(S)$ as the set of all distributions on $S$. The support of $P$ is defined as the set $S u p p(P)={s \in S \mid P(s)>0}$. A distribution $P$ is Dirac if $S u p p(P)$ has only one member. More details about the proposed definitions in this section and their related probability measures are available in $[1,2,16]$.

Definition 1. A Discrete-time Markov Chain (DTMC) is a tuple $D=(S, \hat{s}$, $\mathbf{P}, R, G$ ) where $S$ is a countable, non-empty set of states, $\hat{s} \in S$ is the initial state, $\mathbf{P}: S \times S \rightarrow[0,1]$ is the probabilistic transition function, $R: S \times S \rightarrow \Re_{\geq 0}$ is a reward function wich assigns to each transition of $P$ a non-negative reward value and $G \subseteq S$ is the set of Goal states.

A DTMC $D$ is called finite if $S$ is finite. For a finite $D$, size $(D)$ is the number of states of $D$ plus the number of transitions of the form $\left(s, s^{\prime}\right) \in S \times S$ with $\mathbf{P}$ $\left(s, s^{\prime}\right)>0$. A path represents a possible execution of $D[2]$ and is a non-empty (finite or infinite) sequence of states $s_{0} s_{1} s_{2} \ldots$ such that $\mathbf{P}\left(s_{i}, s_{i+1}\right)>0$ for all $i \geq 0$. We use $P_{a t h s_{D, s}}$ to denote the set of all paths of $D$ that start in the state $s$ and we use FPaths ${ }{D, s}$ for the subset of finite paths of $P{a t h} s_{D, s}$. We also use Path $s_{D}$ and FPaths $_{D}$ for $\cup_{s \in S} P a t h s_{D, s}$ and $\cup_{s \in S} F P a t h s_{D, s}$ respectively. For a finite path $\pi=s_{0} s_{1} \ldots s_{k}$, the accumulated reward is defined as: $\sum_{i<k} R\left(s_{i}, s_{i+1}\right)$. For an infinite path $\pi=s_{0} s_{1} \ldots$ and for every $j \geq 0$, let $\pi[j]=s_{j}$ denote the $(j+1)$ th state of $\pi$ and $\pi[. . j]$ the $(j+1)$ th prefix of the form $s_{0} s_{1} \ldots s_{j}$ of $\pi$. We use pref $(\pi)$ as the set of all prefixes of $\pi$.

## 数学代写|理论计算机代写theoretical computer science代考|Probability Measure of a Markov Chain

In order to reason about the behaviour of a Markov chain $D$, we need to formally use the cylinder sets of the finite paths of $D[1]$.

Definition 2. The Cylinder set of a finite path $\hat{\pi} \in F P a$ h.s $_{D}$ is defined as $\operatorname{Cyl}(\hat{\pi})=\left{\pi \in\right.$ Paths $\left.{D} \mid \hat{\pi} \in \operatorname{pre} f(\pi)\right}$. The probability measure $\operatorname{Pr}^{D}$ is defined on the cylinder sets as $P r^{D}\left(C y l\left(s{0} \ldots s_{n}\right)\right)=\prod_{0 \leq i<n} \mathbf{P}\left(s_{i}, s_{i+1}\right)$.

Markov Decision Processes (MDPs) are a generalization of DTMCs that are used to model systems that have a combination of probabilistic and non-deterministic behaviour. An MDP is a tuple $M=(S, \hat{s}$, Act, $\delta, R, G)$ where $S, \hat{s}$ and $G$ are the same as for DTMCs, Act is a finite set of actions, $R: S \times A c t \times S \rightarrow \Re_{\geq 0}$ is a reward function, assigns to each transition a non-negative reward value and $\delta: S \times A c t \rightarrow \operatorname{Dist}(S)$ is a probabilistic transition function. For every state $s \in S$ of an MDP $M$ one or more actions of Act are defined as enabled actions. We use $A c t(s)$ for this set and define it as $\operatorname{Act}(s)={\alpha \in A c t \mid \delta(s, \alpha)$ is defined $}$.
For $s \in S$ and $\alpha \in \operatorname{Act}(s)$ we use Post $(s, \alpha)$ for the set of $\alpha$ successors of $s$, Post $(s)$ for all successors of $s$ and Pre $(s)$ for predecessors of $s[1]$ :
\begin{aligned} &\operatorname{Post}(s, \alpha) \doteq\left{s^{\prime} \in S \mid \delta\left(s, \alpha, s^{\prime}\right)>0\right} \ &\operatorname{Post}(s) \doteq \cup_{\alpha \in \operatorname{Act}(s)} \operatorname{Post}(s, \alpha), \end{aligned}
To evaluate the operational behaviour of an MDP $M$ we should consider two steps to take a transition from a state $s \in S$. First, one enabled action $\alpha \in \operatorname{Act}(s)$ is chosen non-deterministically. Second, according to the probability distribution $\delta(s, \alpha)$, a successor state $s^{\prime} \in \operatorname{Post}(s, \alpha)$ is selected randomly. In this case, $\delta(s, \alpha)\left(s^{\prime}\right)$ determines the probability of a transition from $s$ to $s^{\prime}$ by the action $\alpha \in \operatorname{Act}(s)$. We extend the definition of paths for MDPs: A path in an MDP $M$ is a non-empty (finite or infinite) sequence $\pi=s_{0} \stackrel{\alpha_{0}}{\rightarrow} s_{1} \stackrel{\alpha_{1}}{\rightarrow} \ldots$ where $s_{i} \in S$ and $\alpha_{i} \in \operatorname{Act}\left(s_{i}\right)$ and $s_{i+1} \in \operatorname{Post}\left(s_{i}, \alpha_{i}\right)$ for every $i \geq 0$. Similar to the case with DTMCs, for a state $s \in S$, we use Paths $_{M, s}$ to denote the set of all paths of $M$ starting in $s$ and $F$ Paths ${ }_{M, s}$ for all finite paths of it. For reasoning about the probabilistic behaviour of an MDP we use the notion of policy (also called adversary) $[2,3,6]$.

Definition 3. A (deterministic) policy of an MDP $M$ is a function $\sigma$ : FPath $_{M} \rightarrow$ Act that for every finite path $\pi=s_{0} \stackrel{\alpha_{0}}{\rightarrow} s_{1} \stackrel{\alpha_{i}}{\rightarrow} \ldots \stackrel{\alpha_{i-1}}{\rightarrow} s_{i}$ selects an enabled action $\alpha_{i} \in \operatorname{Act}\left(s_{i}\right)$. The policy $\sigma$ is memoryless if it depends only on the last state of the path. In general, a policy is defined as function of $F P a t h_{M}$ to a distribution on Act. However, memoryless and deterministic policies are enough for computing the optimal unbounded reachability probabilities [2]. We use $\mathrm{Pol}{M}$ for the set of all deterministic and memoryless policies of $M$. A policy $\sigma \in \operatorname{Pol}{M}$ resolves all non-deterministic choices in $M$ and induces a DTMC $M^{\sigma}$ for which every state is a finite path of $M$.

## 数学代写|理论计算机代写theoretical computer science代考|Probability Measure of a Markov Chain

\begin{aligned} &\operatorname{Post}(s, \alpha) \doteq\left{s^{\prime} \in S \mid \delta\left(s, \alpha, s^{\prime}\ right)>0\right} \ &\operatorname{Post}(s) \doteq \cup_{\alpha \in \operatorname{Act}(s)} \operatorname{Post}(s, \alpha), \end{对齐}\begin{aligned} &\operatorname{Post}(s, \alpha) \doteq\left{s^{\prime} \in S \mid \delta\left(s, \alpha, s^{\prime}\ right)>0\right} \ &\operatorname{Post}(s) \doteq \cup_{\alpha \in \operatorname{Act}(s)} \operatorname{Post}(s, \alpha), \end{对齐}

## 有限元方法代写

tatistics-lab作为专业的留学生服务机构，多年来已为美国、英国、加拿大、澳洲等留学热门地的学生提供专业的学术服务，包括但不限于Essay代写，Assignment代写，Dissertation代写，Report代写，小组作业代写，Proposal代写，Paper代写，Presentation代写，计算机作业代写，论文修改和润色，网课代做，exam代考等等。写作范围涵盖高中，本科，研究生等海外留学全阶段，辐射金融，经济学，会计学，审计学，管理学等全球99%专业科目。写作团队既有专业英语母语作者，也有海外名校硕博留学生，每位写作老师都拥有过硬的语言能力，专业的学科背景和学术写作经验。我们承诺100%原创，100%专业，100%准时，100%满意。

## MATLAB代写

MATLAB 是一种用于技术计算的高性能语言。它将计算、可视化和编程集成在一个易于使用的环境中，其中问题和解决方案以熟悉的数学符号表示。典型用途包括：数学和计算算法开发建模、仿真和原型制作数据分析、探索和可视化科学和工程图形应用程序开发，包括图形用户界面构建MATLAB 是一个交互式系统，其基本数据元素是一个不需要维度的数组。这使您可以解决许多技术计算问题，尤其是那些具有矩阵和向量公式的问题，而只需用 C 或 Fortran 等标量非交互式语言编写程序所需的时间的一小部分。MATLAB 名称代表矩阵实验室。MATLAB 最初的编写目的是提供对由 LINPACK 和 EISPACK 项目开发的矩阵软件的轻松访问，这两个项目共同代表了矩阵计算软件的最新技术。MATLAB 经过多年的发展，得到了许多用户的投入。在大学环境中，它是数学、工程和科学入门和高级课程的标准教学工具。在工业领域，MATLAB 是高效研究、开发和分析的首选工具。MATLAB 具有一系列称为工具箱的特定于应用程序的解决方案。对于大多数 MATLAB 用户来说非常重要，工具箱允许您学习应用专业技术。工具箱是 MATLAB 函数（M 文件）的综合集合，可扩展 MATLAB 环境以解决特定类别的问题。可用工具箱的领域包括信号处理、控制系统、神经网络、模糊逻辑、小波、仿真等。