mirror of
https://github.com/NotXia/unibo-ai-notes.git
synced 2025-12-14 18:51:52 +01:00
Compare commits
2 Commits
617fd5b7bd
...
2ad67a3625
| Author | SHA1 | Date | |
|---|---|---|---|
|
2ad67a3625
|
|||
|
5484e66406
|
@ -284,7 +284,7 @@
|
|||||||
\end{theorem}
|
\end{theorem}
|
||||||
|
|
||||||
\begin{remark}
|
\begin{remark}
|
||||||
By \Cref{th:lti_continuous}, row/column stochasticity is not required for consensus. Instead, the requirement is for the matrix to be Laplacian.
|
By \Cref{th:lti_continuous}, row/column stochasticity is not required for consensus. Instead, the requirement is for the matrix to be the Laplacian.
|
||||||
\end{remark}
|
\end{remark}
|
||||||
\end{description}
|
\end{description}
|
||||||
|
|
||||||
@ -314,7 +314,7 @@
|
|||||||
\end{lemma}
|
\end{lemma}
|
||||||
|
|
||||||
\begin{lemma} \phantomsection\label{th:connected_simple_eigenvalue}
|
\begin{lemma} \phantomsection\label{th:connected_simple_eigenvalue}
|
||||||
If a weighted digraph $G$ is strongly connected, then $\lambda = 0$ is a simple eigenvalue.
|
If a weighted digraph $G$ is strongly connected, then $\lambda = 0$ is a simple eigenvalue of $\matr{L}$.
|
||||||
\end{lemma}
|
\end{lemma}
|
||||||
|
|
||||||
\begin{theorem}[Continuous-time consensus] \marginnote{Continuous-time consensus}
|
\begin{theorem}[Continuous-time consensus] \marginnote{Continuous-time consensus}
|
||||||
|
|||||||
@ -3,7 +3,7 @@
|
|||||||
|
|
||||||
\begin{description}
|
\begin{description}
|
||||||
\item[Leader-follower network] \marginnote{Leader-follower network}
|
\item[Leader-follower network] \marginnote{Leader-follower network}
|
||||||
Consider agents partitioned into $N_f$ followers and $N-N_f$ leaders.
|
Consider $N$ agents partitioned into $N_f$ followers and $N-N_f$ leaders.
|
||||||
|
|
||||||
The state vector can be partitioned as:
|
The state vector can be partitioned as:
|
||||||
\[ \x = \begin{bmatrix} \x_f \\ \x_l \end{bmatrix} \]
|
\[ \x = \begin{bmatrix} \x_f \\ \x_l \end{bmatrix} \]
|
||||||
@ -15,7 +15,7 @@
|
|||||||
\]
|
\]
|
||||||
where $\lap_f$ is the followers' Laplacian, $\lap_l$ the leaders', and $\lap_{fl}$ is the part in common.
|
where $\lap_f$ is the followers' Laplacian, $\lap_l$ the leaders', and $\lap_{fl}$ is the part in common.
|
||||||
|
|
||||||
Assume that leaders and followers run the same Laplacian-based distributed control law (i.e., an normal averaging system), the system can be formulated as:
|
Assuming that leaders and followers run the same Laplacian-based distributed control law (i.e., a normal averaging system), the system can be formulated as:
|
||||||
\[
|
\[
|
||||||
\begin{bmatrix} \dot{\x}_f(t) \\ \dot{\x}_l(t) \end{bmatrix} =
|
\begin{bmatrix} \dot{\x}_f(t) \\ \dot{\x}_l(t) \end{bmatrix} =
|
||||||
- \begin{bmatrix} \lap_f & \lap_{fl} \\ \lap_{fl}^T & \lap_l \end{bmatrix}
|
- \begin{bmatrix} \lap_f & \lap_{fl} \\ \lap_{fl}^T & \lap_l \end{bmatrix}
|
||||||
@ -30,7 +30,7 @@
|
|||||||
\begin{bmatrix}
|
\begin{bmatrix}
|
||||||
\dot{x}_1(t) \\ \dot{x}_2(t) \\ \dot{x}_3(t) \\ \dot{x}_4(t)
|
\dot{x}_1(t) \\ \dot{x}_2(t) \\ \dot{x}_3(t) \\ \dot{x}_4(t)
|
||||||
\end{bmatrix} =
|
\end{bmatrix} =
|
||||||
\begin{bmatrix}
|
- \begin{bmatrix}
|
||||||
\begin{tabular}{ccc|c}
|
\begin{tabular}{ccc|c}
|
||||||
1 & -1 & 0 & 0 \\
|
1 & -1 & 0 & 0 \\
|
||||||
-1 & 2 & -1 & 0 \\
|
-1 & 2 & -1 & 0 \\
|
||||||
@ -122,7 +122,7 @@
|
|||||||
\x_f^T \lap_f \x_f &\geq 0 & & \forall \x_f
|
\x_f^T \lap_f \x_f &\geq 0 & & \forall \x_f
|
||||||
\end{aligned}
|
\end{aligned}
|
||||||
\]
|
\]
|
||||||
\item The only case when $\x^T \lap \x = 0$ for $\x \neq 0$ is with $\x = \alpha\vec{1}$ for $\alpha \neq 0$. As $\forall \x_f: \bar{\x} \neq \alpha\vec{1}$, it holds that $\forall \x_f: \x_f^T \lap_f \x_f \neq 0$.
|
\item The only case when $\x^T \lap \x = 0$ for $\x \neq 0$ is with $\x = \alpha\vec{1}$ for $\alpha \neq 0$. As $\forall \x_f: \bar{\x} \neq \alpha\vec{1}$, it holds that $\forall \x_f \neq 0: \x_f^T \lap_f \x_f \neq 0$.
|
||||||
\end{enumerate}
|
\end{enumerate}
|
||||||
Therefore, $\lap_f$ is positive definite as $\forall \x_f \neq 0: \x_f^T \lap_f \x_f > 0$.
|
Therefore, $\lap_f$ is positive definite as $\forall \x_f \neq 0: \x_f^T \lap_f \x_f > 0$.
|
||||||
\end{proof}
|
\end{proof}
|
||||||
@ -182,7 +182,7 @@
|
|||||||
Therefore, we have that:
|
Therefore, we have that:
|
||||||
\[
|
\[
|
||||||
\begin{aligned}
|
\begin{aligned}
|
||||||
\left( \sum_{j=1}^N a_{ij} \right) x_{E,i} &= \sum_{j=1}^N a_{ij} x_{E,j} & & \forall i \in \{ 1, \dots, N_f \} \\
|
\left( \sum_{k=1}^N a_{ik} \right) x_{E,i} &= \sum_{j=1}^N a_{ij} x_{E,j} & & \forall i \in \{ 1, \dots, N_f \} \\
|
||||||
x_{E,i} &= \sum_{j=1}^N \frac{a_{ij}}{\sum_{k=1}^N a_{ik}} x_{E,j} & & \forall i \in \{ 1, \dots, N_f \} \\
|
x_{E,i} &= \sum_{j=1}^N \frac{a_{ij}}{\sum_{k=1}^N a_{ik}} x_{E,j} & & \forall i \in \{ 1, \dots, N_f \} \\
|
||||||
\end{aligned}
|
\end{aligned}
|
||||||
\]
|
\]
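\begin{example}
    As an illustrative sketch (the weights and positions are assumed here, not taken from the notes), consider a follower $i$ whose only neighbors are two static leaders with $x_{E,1} = 0$ and $x_{E,2} = 4$, and weights $a_{i1} = 1$ and $a_{i2} = 3$. The last equation gives:
    \[
        x_{E,i} = \frac{1}{1+3} \cdot 0 + \frac{3}{1+3} \cdot 4 = 3
    \]
    The coefficients $\frac{a_{ij}}{\sum_{k=1}^N a_{ik}}$ are non-negative and sum to $1$, so $x_{E,i}$ is a convex combination of its neighbors' equilibria and, in this case, lies between the two leaders.
\end{example}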
|
||||||
@ -211,7 +211,7 @@
|
|||||||
\end{description}
|
\end{description}
|
||||||
|
|
||||||
\begin{theorem}[Containment with non-static leaders non-equilibrium]
|
\begin{theorem}[Containment with non-static leaders non-equilibrium]
|
||||||
Naive containment with non-static leaders do not have an equilibrium.
|
Naive containment with non-static leaders does not have an equilibrium.
|
||||||
|
|
||||||
\begin{proof}
|
\begin{proof}
|
||||||
Ideally, the equilibria for followers' and leader's dynamics are:
|
Ideally, the equilibria for followers' and leader's dynamics are:
|
||||||
@ -235,7 +235,7 @@
|
|||||||
\end{split}
|
\end{split}
|
||||||
\]
|
\]
|
||||||
|
|
||||||
By inspecting the value of the containment error $\vec{e}(t)$ when it reaches equilibrium we have that:
|
By inspecting the value of the containment error $\vec{e}(t)$ when it reaches equilibrium, we have that:
|
||||||
\[
|
\[
|
||||||
\begin{split}
|
\begin{split}
|
||||||
0 &= \dot{\vec{e}}(t) \\
|
0 &= \dot{\vec{e}}(t) \\
|
||||||
@ -331,7 +331,7 @@
|
|||||||
|
|
||||||
\begin{description}
|
\begin{description}
|
||||||
\item[Containment with discrete-time] \marginnote{Containment with discrete-time}
|
\item[Containment with discrete-time] \marginnote{Containment with discrete-time}
|
||||||
Containment can be discretized using the forward-Eurler discretization. Its dynamics is defined as:
|
Containment can be discretized using the forward-Euler discretization. Its dynamics is defined as:
|
||||||
\[
|
\[
|
||||||
\begin{aligned}
|
\begin{aligned}
|
||||||
\dot{\x}_i(t) &= - \sum_{j \in \mathcal{N}_i} a_{ij} (x_i(t) - x_j(t)) & & \forall i \in \{1, \dots, N_f\} \\
|
\dot{\x}_i(t) &= - \sum_{j \in \mathcal{N}_i} a_{ij} (x_i(t) - x_j(t)) & & \forall i \in \{1, \dots, N_f\} \\
|
||||||
@ -373,5 +373,5 @@
|
|||||||
\[
|
\[
|
||||||
\dot{\x}(t) = - \lap \otimes \matr{I}_d \x(t)
|
\dot{\x}(t) = - \lap \otimes \matr{I}_d \x(t)
|
||||||
\]
|
\]
|
||||||
where $\otimes$ is the Kronecker product.
|
where $\otimes$ is the Kronecker product (i.e., apply the same matrix across each dimension).
|
||||||
\end{description}
|
\end{description}
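\begin{example}
    A small sketch with assumed values: for $N = 2$ agents connected by an edge of unit weight and $d = 2$, the matrices are:
    \[
        \lap = \begin{bmatrix} 1 & -1 \\ -1 & 1 \end{bmatrix}
        \qquad
        \lap \otimes \matr{I}_2 =
        \begin{bmatrix}
            1 & 0 & -1 & 0 \\
            0 & 1 & 0 & -1 \\
            -1 & 0 & 1 & 0 \\
            0 & -1 & 0 & 1
        \end{bmatrix}
    \]
    With $\x = \begin{bmatrix} \x_1^T & \x_2^T \end{bmatrix}^T$ and $\x_i \in \mathbb{R}^2$, the dynamics reads $\dot{\x}_1 = -(\x_1 - \x_2)$ and $\dot{\x}_2 = -(\x_2 - \x_1)$. Each coordinate evolves independently according to the same Laplacian.
\end{example}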
|
||||||
@ -6,7 +6,7 @@
|
|||||||
Problem where $N$ agents want to optimize their positions $\z_i \in \mathbb{R}^2$ to perform multi-robot surveillance in an environment with:
|
Problem where $N$ agents want to optimize their positions $\z_i \in \mathbb{R}^2$ to perform multi-robot surveillance in an environment with:
|
||||||
\begin{itemize}
|
\begin{itemize}
|
||||||
\item A static target to protect $\r_0 \in \mathbb{R}^2$.
|
\item A static target to protect $\r_0 \in \mathbb{R}^2$.
|
||||||
\item Static intruders/opponents $\r_i \in \mathbb{R}^2$, each assigned to an agent $i$.
|
\item Static intruders/opponents $\r_i \in \mathbb{R}^2$, each assigned to the respective agent $i$.
|
||||||
\end{itemize}
|
\end{itemize}
|
||||||
|
|
||||||
The average position of the agents define the barycenter:
|
The average position of the agents define the barycenter:
|
||||||
@ -16,7 +16,7 @@
|
|||||||
\[
|
\[
|
||||||
l_i(\z_i, \sigma(\z)) =
|
l_i(\z_i, \sigma(\z)) =
|
||||||
\gamma_i \underbrace{\Vert \z_i - \r_i \Vert^2}_{\text{close to opponent}} +
|
\gamma_i \underbrace{\Vert \z_i - \r_i \Vert^2}_{\text{close to opponent}} +
|
||||||
\underbrace{\Vert \sigma(\z) - \r_0 \Vert^2}_{\text{barycenter close to protectee}}
|
\underbrace{\Vert \sigma(\z) - \r_0 \Vert^2}_{\text{barycenter close to target}}
|
||||||
\]
|
\]
|
||||||
Note that the opponent component only depends on local variables while the target component needs global information.
|
Note that the opponent component only depends on local variables while the target component needs global information.
|
||||||
|
|
||||||
@ -69,8 +69,9 @@
|
|||||||
&\frac{\partial}{\partial z_i} \left.\left( \sum_{j=1}^{N} l_j(z_j, \sigma(z_1, \dots, z_N)) \right) \right|_{z_j=z_j^k} \\
|
&\frac{\partial}{\partial z_i} \left.\left( \sum_{j=1}^{N} l_j(z_j, \sigma(z_1, \dots, z_N)) \right) \right|_{z_j=z_j^k} \\
|
||||||
&=
|
&=
|
||||||
\left.\frac{\partial}{\partial z_i} l_i(z_i, \sigma) \right|_{\substack{z_i = z_i^k,\\\sigma = \sigma(\z^k)}} +
|
\left.\frac{\partial}{\partial z_i} l_i(z_i, \sigma) \right|_{\substack{z_i = z_i^k,\\\sigma = \sigma(\z^k)}} +
|
||||||
\left.\left(\sum_{j=1}^{N} \frac{\partial}{\partial \sigma} l_j(z_j, \sigma) \right)\right|_{\substack{z_j = z_j^k,\\\sigma = \sigma(\z^k)}} \cdot
|
\sum_{j=1}^{N} \left( \left. \left( \frac{\partial}{\partial \sigma} l_j(z_j, \sigma) \right)\right|_{\substack{z_j = z_j^k,\\\sigma = \sigma(\z^k)}}
|
||||||
\left.\frac{\partial}{\partial z_i} \sigma(z_1, \dots, z_N)\right|_{\substack{z_j=z_j^k}}
|
\cdot
|
||||||
|
\left.\frac{\partial}{\partial z_i} \sigma(z_1, \dots, z_N)\right|_{\substack{z_j=z_j^k}} \right)
|
||||||
\end{split}
|
\end{split}
|
||||||
\]
|
\]
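\begin{example}
    As a sketch, the formula can be evaluated on the surveillance cost $l_i$ defined above, assuming that the barycenter is the plain average $\sigma(\z) = \frac{1}{N} \sum_{j=1}^{N} \z_j$. The three factors are:
    \[
        \frac{\partial}{\partial \z_i} l_i(\z_i, \sigma) = 2\gamma_i (\z_i - \r_i)
        \qquad
        \frac{\partial}{\partial \sigma} l_j(\z_j, \sigma) = 2(\sigma - \r_0)
        \qquad
        \frac{\partial}{\partial \z_i} \sigma(\z) = \frac{1}{N} \matr{I}
    \]
    Therefore, the gradient with respect to $\z_i$ evaluated at $\z^k$ is:
    \[
        2\gamma_i (\z_i^k - \r_i) + \sum_{j=1}^{N} \frac{2}{N} (\sigma(\z^k) - \r_0)
        = 2\gamma_i (\z_i^k - \r_i) + 2(\sigma(\z^k) - \r_0)
    \]
    The first term only uses local quantities, while the second one requires the global barycenter.
\end{example}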
|
||||||
|
|
||||||
|
|||||||
@ -16,7 +16,7 @@
|
|||||||
\[
|
\[
|
||||||
F_{e,i}(x) = -a_{i,i-1}(x_i-x_{i-1}) - a_{i,i+1}(x_i - x_{i+1})
|
F_{e,i}(x) = -a_{i,i-1}(x_i-x_{i-1}) - a_{i,i+1}(x_i - x_{i+1})
|
||||||
\]
|
\]
|
||||||
Equivalently, it is possible to express the elastic force as the negative gradient of the elastic energy:
|
Equivalently, it is possible to express the elastic force as the negative gradient of the elastic potential energy:
|
||||||
\[
|
\[
|
||||||
F_{e,i}(x) = -\frac{\partial}{\partial x_i}\left( \frac{1}{2} a_{i,i-1} \Vert x_i - x_{i-1} \Vert^2 + \frac{1}{2} a_{i,i+1} \Vert x_i - x_{i+1} \Vert^2 \right)
|
F_{e,i}(x) = -\frac{\partial}{\partial x_i}\left( \frac{1}{2} a_{i,i-1} \Vert x_i - x_{i-1} \Vert^2 + \frac{1}{2} a_{i,i+1} \Vert x_i - x_{i+1} \Vert^2 \right)
|
||||||
\]
|
\]
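\begin{remark}
    The equivalence can be checked term by term. Differentiating a single spring term gives:
    \[
        \frac{\partial}{\partial x_i} \left( \frac{1}{2} a_{i,i-1} \Vert x_i - x_{i-1} \Vert^2 \right) = a_{i,i-1} (x_i - x_{i-1})
    \]
    and analogously $a_{i,i+1} (x_i - x_{i+1})$ for the other term. Negating their sum yields $-a_{i,i-1}(x_i - x_{i-1}) - a_{i,i+1}(x_i - x_{i+1})$, which is the elastic force $F_{e,i}(x)$.
\end{remark}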
|
||||||
@ -86,7 +86,7 @@
|
|||||||
\end{figure}
|
\end{figure}
|
||||||
\end{remark}
|
\end{remark}
|
||||||
|
|
||||||
By adding a damping coefficient (i.e., dispersion of velocity) $c=1$, the overall system dynamics can be defined as:
|
By adding a constant damping coefficient (i.e., dispersion of velocity) $c=1$, the overall system dynamics can be defined as:
|
||||||
\[
|
\[
|
||||||
\begin{split}
|
\begin{split}
|
||||||
\dot{x}_i &= v_i \\
|
\dot{x}_i &= v_i \\
|
||||||
@ -148,7 +148,7 @@
|
|||||||
|
|
||||||
\begin{description}
|
\begin{description}
|
||||||
\item[Formation control] \marginnote{Formation control}
|
\item[Formation control] \marginnote{Formation control}
|
||||||
Consider $N$ agents with states $\x_i(t) \in \mathbb{R}^d$ and communicating according to a fixed undirected graph $G$, and a set of distances $d_{ij} = d_{ji}$. The goal is to position each agent respecting the desired distances between them:
|
Consider $N$ agents with states $\x_i(t) \in \mathbb{R}^d$ and communicating according to a fixed undirected graph $G$. The goal is to position each agent respecting the desired distances $d_{ij} = d_{ji}$ between them:
|
||||||
\[
|
\[
|
||||||
\forall (i,j) \in E: \Vert \x_i^\text{form} - \x_j^\text{form} \Vert = d_{ij}
|
\forall (i,j) \in E: \Vert \x_i^\text{form} - \x_j^\text{form} \Vert = d_{ij}
|
||||||
\]
|
\]
|
||||||
|
|||||||
@ -307,8 +307,9 @@
|
|||||||
\[
|
\[
|
||||||
\begin{aligned}
|
\begin{aligned}
|
||||||
V(\tilde{\z}^{k+1}) - V(\tilde{\z}^k) &= \Vert \tilde{\z}^{k+1} \Vert^2 - \Vert \tilde{\z}^k \Vert^2 \\
|
V(\tilde{\z}^{k+1}) - V(\tilde{\z}^k) &= \Vert \tilde{\z}^{k+1} \Vert^2 - \Vert \tilde{\z}^k \Vert^2 \\
|
||||||
&= \cancel{\Vert \tilde{\z}^k \Vert^2} - 2\alpha(\vec{u}^k)^T\tilde{\z}^k + \alpha^2 \Vert \vec{u}^k \Vert^2 - \cancel{\Vert \tilde{\z}^k \Vert^2} &&& \text{\Cref{th:strong_convex_lipschitz_gradient}} \\
|
&= \Vert \tilde{\z}^{k} - \alpha \vec{u}^{k} \Vert^2 - \Vert \tilde{\z}^k \Vert^2 \\
|
||||||
&\leq -2\alpha\gamma_1 \Vert\tilde{\z}^k\Vert^2 + \alpha(\alpha-2\gamma_2) \Vert\vec{u}^k\Vert^2
|
&= \cancel{\Vert \tilde{\z}^k \Vert^2} - 2\alpha(\vec{u}^k)^T\tilde{\z}^k + \alpha^2 \Vert \vec{u}^k \Vert^2 - \cancel{\Vert \tilde{\z}^k \Vert^2} \\
|
||||||
|
&\leq -2\alpha\gamma_1 \Vert\tilde{\z}^k\Vert^2 + \alpha(\alpha-2\gamma_2) \Vert\vec{u}^k\Vert^2 &&& \text{\Cref{th:strong_convex_lipschitz_gradient}}
|
||||||
\end{aligned}
|
\end{aligned}
|
||||||
\]
|
\]
|
||||||
|
|
||||||
@ -344,7 +345,7 @@
|
|||||||
\[
|
\[
|
||||||
\min_{\z} \frac{1}{2}\z^T \matr{Q} \z + \vec{r}^T \z
|
\min_{\z} \frac{1}{2}\z^T \matr{Q} \z + \vec{r}^T \z
|
||||||
\qquad
|
\qquad
|
||||||
\nabla l = \matr{Q} \z^k + \vec{r}
|
\nabla l = \matr{Q} \z + \vec{r}
|
||||||
\]
|
\]
|
||||||
The gradient method can be reduced to an affine linear system:
|
The gradient method can be reduced to an affine linear system:
|
||||||
\[
|
\[
|
||||||
|
|||||||
@ -87,4 +87,237 @@
|
|||||||
\begin{description}
|
\begin{description}
|
||||||
\item[Training data]
|
\item[Training data]
|
||||||
Manually annotated terms of service.
|
Manually annotated terms of service.
|
||||||
|
|
||||||
|
\item[Tasks] Two tasks are solved:
|
||||||
|
\begin{description}
|
||||||
|
\item[Detection] Binary classification problem aimed at determining whether a sentence contains a potentially unfair clause.
|
||||||
|
\item[Sentence classification] Classification problem of determining the category of the unfair clause.
|
||||||
|
\end{description}
|
||||||
|
|
||||||
|
\item[Experimental setup]
|
||||||
|
Leave-one-out over documents: one document is used as the test set and the remaining ones are split into training ($\frac{4}{5}$) and validation ($\frac{1}{5}$) sets.
|
||||||
|
|
||||||
|
\item[Metrics] Precision, recall, F1.
|
||||||
|
\end{description}
|
||||||
|
|
||||||
|
|
||||||
|
\subsection{Base clause classifier}
|
||||||
|
|
||||||
|
The methods experimented with were:
|
||||||
|
\begin{itemize}
|
||||||
|
\item Bag-of-words,
|
||||||
|
\item Tree kernels,
|
||||||
|
\item CNN,
|
||||||
|
\item SVM,
|
||||||
|
\item \dots
|
||||||
|
\end{itemize}
|
||||||
|
|
||||||
|
|
||||||
|
\subsection{Background knowledge injection}
|
||||||
|
|
||||||
|
\begin{description}
|
||||||
|
\item[Memory-augmented neural network] \marginnote{Memory-augmented neural network}
|
||||||
|
Model that, given a query, retrieves relevant knowledge from a memory and combines it with the query to produce the prediction.
|
||||||
|
|
||||||
|
In CLAUDETTE, the knowledge base is composed of all the possible rationales for which a clause can be unfair. The workflow is the following:
|
||||||
|
\begin{enumerate}
|
||||||
|
\item The clause is used to query the knowledge base using a similarity score and the most relevant rationale is extracted.
|
||||||
|
\item The rationale is combined with the query.
|
||||||
|
\item Repeat the extraction step until the similarity score is too low.
|
||||||
|
\item Make the prediction and provide the rationales used as explanation.
|
||||||
|
\end{enumerate}
|
||||||
|
\end{description}
|
||||||
|
|
||||||
|
\begin{example}[Knowledge base for liability exclusion]
|
||||||
|
Rationales are divided into six classes of clauses:
|
||||||
|
\begin{itemize}
|
||||||
|
\item Kind of damage,
|
||||||
|
\item Standard of care,
|
||||||
|
\item Cause,
|
||||||
|
\item Causal link,
|
||||||
|
\item Liability theory,
|
||||||
|
\item Compensation amount.
|
||||||
|
\end{itemize}
|
||||||
|
\end{example}
|
||||||
|
|
||||||
|
|
||||||
|
\subsection{Multilingualism}
|
||||||
|
|
||||||
|
\begin{description}
|
||||||
|
\item[Training data]
|
||||||
|
The same terms of service as in the original CLAUDETTE corpus, selected according to the following criteria:
|
||||||
|
\begin{itemize}
|
||||||
|
\item The ToS is available in the target language,
|
||||||
|
\item There is a correspondence in terms of version or publication date between the documents in the two languages,
|
||||||
|
\item There are structural similarities between the documents in the two languages.
|
||||||
|
\end{itemize}
|
||||||
|
\end{description}
|
||||||
|
|
||||||
|
|
||||||
|
\begin{description}
|
||||||
|
\item[Approaches] Different strategies have been experimented with:
|
||||||
|
\begin{description}
|
||||||
|
\item[Novel corpus for target language] \marginnote{Novel corpus for target language}
|
||||||
|
Retrain CLAUDETTE from scratch with newly annotated data in the target language.
|
||||||
|
|
||||||
|
\item[Semi-automated creation of corpus through projection] \marginnote{Semi-automated creation of corpus through projection}
|
||||||
|
Method that works as follows:
|
||||||
|
\begin{enumerate}
|
||||||
|
\item Use machine translation to translate the annotated English document into the target language while projecting the unfair clauses.
|
||||||
|
\item Match the machine-translated document with the original document in the target language and project the unfair clauses (through human annotation).
|
||||||
|
\item Train CLAUDETTE from scratch.
|
||||||
|
\end{enumerate}
|
||||||
|
|
||||||
|
\item[Training set translation] \marginnote{Training set translation}
|
||||||
|
Translate the original documents into the target language and train CLAUDETTE from scratch.
|
||||||
|
|
||||||
|
\begin{remark}
|
||||||
|
This method does not require human annotation.
|
||||||
|
\end{remark}
|
||||||
|
|
||||||
|
\item[Machine translation of queries] \marginnote{Machine translation of queries}
|
||||||
|
Method that works as follows:
|
||||||
|
\begin{enumerate}
|
||||||
|
\item Translate the document from the target language to English.
|
||||||
|
\item Feed the translated document to CLAUDETTE.
|
||||||
|
\item Translate the English document back to the target language.
|
||||||
|
\end{enumerate}
|
||||||
|
|
||||||
|
\begin{remark}
|
||||||
|
This method does not require retraining.
|
||||||
|
\end{remark}
|
||||||
|
\end{description}
|
||||||
|
\end{description}
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
\section{CLAUDETTE and GDPR}
|
||||||
|
|
||||||
|
|
||||||
|
\begin{description}
|
||||||
|
\item[CLAUDETTE for GDPR compliance]
|
||||||
|
To integrate CLAUDETTE as a tool to check GDPR compliance, three dimensions, each containing different categories (ranked with three levels of achievement), are checked:
|
||||||
|
\begin{descriptionlist}
|
||||||
|
\item[Comprehensiveness of information] \marginnote{Comprehensiveness of information}
|
||||||
|
Whether the policy contains all the information required by articles 13 and 14 of the GDPR.
|
||||||
|
|
||||||
|
The categories of this dimension comprise:
|
||||||
|
\begin{itemize}
|
||||||
|
\item Contact information of the controller,
|
||||||
|
\item Contact information of the data protection officer,
|
||||||
|
\item Purpose and legal bases for processing,
|
||||||
|
\item Category of personal data processed,
|
||||||
|
\item \dots
|
||||||
|
\end{itemize}
|
||||||
|
|
||||||
|
\item[Substantive compliance] \marginnote{Substantive compliance}
|
||||||
|
Whether the processing of personal data described in the policy complies with the GDPR.
|
||||||
|
|
||||||
|
The categories of this dimension comprise:
|
||||||
|
\begin{itemize}
|
||||||
|
\item Processing of sensitive data,
|
||||||
|
\item Processing of children's data,
|
||||||
|
\item Consent by using, take-or-leave,
|
||||||
|
\item Transfer to third parties or countries,
|
||||||
|
\item Policy change (e.g., if the data subject is notified),
|
||||||
|
\item Licensing data,
|
||||||
|
\item Advertising.
|
||||||
|
\end{itemize}
|
||||||
|
|
||||||
|
\item[Clarity of expression] \marginnote{Clarity of expression}
|
||||||
|
Whether the policy is precise and understandable (i.e., transparent).
|
||||||
|
|
||||||
|
The categories of this dimension comprise:
|
||||||
|
\begin{itemize}
|
||||||
|
\item Conditional terms: the performance of an action is dependent on a variable trigger.
|
||||||
|
\begin{remark}
|
||||||
|
Typical language qualifiers to identify this category are: depending, as necessary, as appropriate, as needed, otherwise reasonably, sometimes, from time to time, \dots
|
||||||
|
\end{remark}
|
||||||
|
\begin{example}
|
||||||
|
``\textit{We also may share your information if we believe, in our sole discretion, that such disclosure is \underline{necessary} \textnormal{\dots}}''
|
||||||
|
\end{example}
|
||||||
|
|
||||||
|
\item Generalization: terms to abstract practices with an unclear context.
|
||||||
|
\begin{remark}
|
||||||
|
Typical language qualifiers to identify this category are: generally, mostly, widely, general, commonly, usually, normally, typically, largely, often, primarily, among other things, \dots
|
||||||
|
\end{remark}
|
||||||
|
\begin{example}
|
||||||
|
``\textit{We \underline{typically} or \underline{generally} collect information \dots When you use an Application on a Device, we will collect and use information about you in \underline{generally} similar ways and for similar purposes as when you use the TripAdvisor website.}''
|
||||||
|
\end{example}
|
||||||
|
|
||||||
|
\item Modality: terms that ambiguously refer to the possibility of actions or events.
|
||||||
|
\begin{remark}
|
||||||
|
Typical language qualifiers to identify this category are: may, might, could, would, possible, possibly, \dots
|
||||||
|
|
||||||
|
Note that these qualifiers have two possible meanings: possibility and permission. This category only deals with possibility.
|
||||||
|
\end{remark}
|
||||||
|
\begin{example}
|
||||||
|
``\textit{We \underline{may} use your personal data to develop new services.}''
|
||||||
|
\end{example}
|
||||||
|
|
||||||
|
\item Non-specific numeric quantifiers: terms that are ambiguous in terms of actual measure.
|
||||||
|
\begin{remark}
|
||||||
|
Typical language qualifiers to identify this category are: certain, numerous, some, most, many, various, including (but not limited to), variety, \dots
|
||||||
|
\end{remark}
|
||||||
|
\begin{example}
|
||||||
|
``\textit{\textnormal{\dots}we may collect a \underline{variety} of information, \underline{including} your name, mailing address, phone number, email address, \dots}''
|
||||||
|
\end{example}
|
||||||
|
\end{itemize}
|
||||||
|
\end{descriptionlist}
|
||||||
|
\end{description}
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
\section{LLMs and privacy policies}
|
||||||
|
|
||||||
|
\begin{remark}
|
||||||
|
The GDPR requires two competing properties for privacy policies:
|
||||||
|
\begin{descriptionlist}
|
||||||
|
\item[Comprehensiveness] The policy should contain all the relevant information.
|
||||||
|
\item[Comprehensibility] The policy should be easily understandable.
|
||||||
|
\end{descriptionlist}
|
||||||
|
\end{remark}
|
||||||
|
|
||||||
|
|
||||||
|
\begin{description}
|
||||||
|
\item[Comprehensive policy from LLMs]
|
||||||
|
Formulate privacy policies for comprehensiveness and let LLMs extract the relevant information.
|
||||||
|
|
||||||
|
A template for a comprehensive policy could include:
|
||||||
|
\begin{itemize}
|
||||||
|
\item Categories of personal data collected,
|
||||||
|
\item Purpose each category of data is processed for,
|
||||||
|
\item Legal basis for processing each category,
|
||||||
|
\item Storage period or deletion criteria,
|
||||||
|
\item Recipients or categories of recipients the data is shared with, their role, the purpose of sharing, and the legal basis.
|
||||||
|
\end{itemize}
|
||||||
|
\end{description}
|
||||||
|
|
||||||
|
\begin{description}
|
||||||
|
\item[Experimental setup]
|
||||||
|
The following questions were defined to assess a privacy policy:
|
||||||
|
\begin{enumerate}
|
||||||
|
\item What data does the company process about me?
|
||||||
|
\item For what purposes does the company use my email address?
|
||||||
|
\item Who does the company share my geolocation with?
|
||||||
|
\item What types of data are processed on the basis of consent, and for what purposes?
|
||||||
|
\item What data does the company share with Facebook?
|
||||||
|
\item Does the company share my data with insurers?
|
||||||
|
\item What categories of data does the company collect about me automatically?
|
||||||
|
\item How can I contact the company if I want to exercise my rights?
|
||||||
|
\item How long does the company keep my delivery address?
|
||||||
|
\end{enumerate}
|
||||||
|
|
||||||
|
Three scenarios were considered:
|
||||||
|
\begin{itemize}
|
||||||
|
\item Human evaluation of the questions on existing privacy policies,
|
||||||
|
\item LLMs to answer the questions on ideal mock policies (with human evaluation),
|
||||||
|
\item LLMs to answer the questions on real policies (with human evaluation).
|
||||||
|
\end{itemize}
|
||||||
|
|
||||||
|
Results show that:
|
||||||
|
\begin{itemize}
|
||||||
|
\item LLMs have high performance on the mock policies.
|
||||||
|
\item LLMs and humans struggle to answer the questions on real privacy policies.
|
||||||
|
\end{itemize}
|
||||||
\end{description}
|
\end{description}
|
||||||