mirror of https://github.com/NotXia/unibo-ai-notes.git (synced 2025-12-15 02:52:22 +01:00)
Fix typos <noupdate>
@@ -48,7 +48,7 @@ At level $i$, a value is assigned to the variable $X_i$ and
 constraints involving $X_1, \dots, X_i$ are checked.
 In case of failure, the path is not further explored.

-A problem of this approach is that it requires to backtrack in case of failure
+A problem with this approach is that it requires to backtrack in case of failure
 and reassign all the variables in the worst case.


@@ -81,7 +81,7 @@ If the domain of a variable becomes empty, the path is considered a failure and
 \item $X_1 = 2 \hspace{1cm} X_2 :: [\cancel{1}, \cancel{2}, 3] \hspace{1cm} X_3 :: [\cancel{1}, \cancel{2}, 3]$
 \item $X_1 = 2 \hspace{1cm} X_2 = 3 \hspace{1cm} X_3 :: [\cancel{1}, \cancel{2}, \cancel{3}]$
 \end{enumerate}
-As the domain of $X_3$ is empty, search on this branch fails and backtracking is required.
+As the domain of $X_3$ is empty, a search on this branch fails and backtracking is required.
 \end{example}


@@ -103,7 +103,7 @@ If the domain of a variable becomes empty, the path is considered a failure and
 Consider the variables and constraints:
 \[ X_1 :: [1, 2, 3] \hspace{0.5cm} X_2 :: [1, 2, 3] \hspace{0.5cm} X_3 :: [1, 2, 3] \hspace{1cm} X_1 < X_2 < X_3 \]

-We assign the variables in lexicographic order. At each step we have that:
+We assign the variables in lexicographic order. At each step, we have that:
 \begin{enumerate}
 \item $X_1 = 1 \hspace{1cm} X_2 :: [\cancel{1}, 2, \cancel{3}] \hspace{1cm} X_3 :: [\cancel{1}, 2, 3]$ \\
 Here, we assign $X_1=1$ and propagate to unassigned constraints.
@@ -120,7 +120,7 @@ If the domain of a variable becomes empty, the path is considered a failure and
 Consider the variables and constraints:
 \[ X_1 :: [1, 2, 3] \hspace{0.5cm} X_2 :: [1, 2, 3] \hspace{0.5cm} X_3 :: [1, 2, 3] \hspace{1cm} X_1 < X_2 < X_3 \]

-We assign the variables in lexicographic order. At each step we have that:
+We assign the variables in lexicographic order. At each step, we have that:
 \begin{enumerate}
 \item $X_1 = 1 \hspace{1cm} X_2 :: [\cancel{1}, 2, \cancel{3}] \hspace{1cm} X_3 :: [\cancel{1}, \cancel{2}, 3]$ \\
 Here, we assign $X_1=1$ and propagate to unassigned constraints.
@@ -252,5 +252,5 @@ This class of methods can be applied statically before the search or after each
 Generalization of arc/path consistency.
 If a problem with $n$ variables is $n$-consistent, the solution can be found without search.

-Usually it is not applicable as it has exponential complexity.
+Usually, it is not applicable as it has exponential complexity.
 \end{description}
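
As an illustrative aside, the propagation scheme these hunks describe (assign a variable, prune the domains of the unassigned ones, fail as soon as a domain becomes empty) can be sketched in Python. This is a minimal sketch written for this page, not code from the repository; it hard-codes the chain constraint $X_1 < X_2 < \dots$ of the example above.

def forward_checking(domains, assignment=()):
    """domains: candidate values for the still-unassigned variables of
    the chain X_1 < X_2 < ...; returns a full assignment or None."""
    if not domains:
        return assignment                          # all variables assigned
    for value in domains[0]:
        # Propagation: remove values incompatible with `value` from the
        # domains of the later variables.
        pruned = [[v for v in d if v > value] for d in domains[1:]]
        if any(len(d) == 0 for d in pruned):
            continue                               # empty domain: branch fails
        solution = forward_checking(pruned, assignment + (value,))
        if solution is not None:
            return solution
    return None

print(forward_checking([[1, 2, 3]] * 3))           # (1, 2, 3)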
@@ -45,7 +45,7 @@ an iteration of the minimax algorithm can be described as follows:

 \item[Propagation]
 Starting from the parents of the leaves, the scores are propagated upwards
-by labeling the parents based on the children's score.
+by labeling the parents based on the children's scores.

 Given an unlabeled node $m$, if $m$ is at a \textsc{Max} level, its label is the maximum of its children's score.
 Otherwise (\textsc{Min} level), the label is the minimum of its children's score.
@@ -90,7 +90,7 @@ an iteration of the minimax algorithm can be described as follows:

 \section{Alpha-beta cuts}
 \marginnote{Alpha-beta cuts}
-Alpha-beta cuts (pruning) allows to prune subtrees whose state will never be selected (when playing optimally).
+Alpha-beta pruning (cuts) allows to prune subtrees whose state will never be selected (when playing optimally).
 $\alpha$ represents the best choice found for \textsc{Max}.
 $\beta$ represents the best choice found for \textsc{Min}.

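
As an illustrative aside, here is a minimal Python sketch of minimax with alpha-beta cuts matching the description above; `children` and `score` are hypothetical problem-specific callbacks, not part of the notes.

import math

def alphabeta(state, depth, alpha, beta, is_max, children, score):
    kids = children(state)
    if depth == 0 or not kids:
        return score(state)
    if is_max:
        value = -math.inf
        for child in kids:
            value = max(value, alphabeta(child, depth - 1, alpha, beta, False, children, score))
            alpha = max(alpha, value)              # best choice found for Max
            if alpha >= beta:
                break                              # cut: Min will never allow this subtree
        return value
    value = math.inf
    for child in kids:
        value = min(value, alphabeta(child, depth - 1, alpha, beta, True, children, score))
        beta = min(beta, value)                    # best choice found for Min
        if beta <= alpha:
            break                                  # cut: Max will never choose this subtree
    return value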
@@ -32,7 +32,7 @@ Intelligence is defined as the ability to perceive or infer information and to r
 \item[Symbolic AI (top-down)] \marginnote{Symbolic AI}
 Symbolic representation of knowledge, understandable by humans.

-\item[Connectionist approach (bottom up)] \marginnote{Connectionist approach}
+\item[Connectionist approach (bottom-up)] \marginnote{Connectionist approach}
 Neural networks. Knowledge is encoded and not understandable by humans.
 \end{description}

@@ -124,7 +124,7 @@ A \textbf{feed-forward neural network} is composed of multiple layers of neurons
 The first layer is the input layer, while the last is the output layer.
 Intermediate layers are hidden layers.

-The expressivity of a neural networks increases when more neurons are used:
+The expressivity of a neural network increases when more neurons are used:
 \begin{descriptionlist}
 \item[Single perceptron]
 Able to compute a linear separation.
@@ -158,7 +158,7 @@ The expressivity of a neural networks increases when more neurons are used:
 \item[Deep learning] \marginnote{Deep learning}
 Neural network with a large number of layers and neurons.
 The learning process is hierarchical: the network exploits simple features in the first layers and
-synthesis more complex concepts while advancing through the layers.
+synthesizes more complex concepts while advancing through the layers.
 \end{description}


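
As an illustrative aside, the claim that a single perceptron computes a linear separation can be checked directly: a threshold on a weighted sum splits the plane with a line. The weights below are hand-picked for this sketch, not taken from the notes.

def perceptron(x1, x2, w1=1.0, w2=1.0, threshold=1.5):
    return 1 if w1 * x1 + w2 * x2 >= threshold else 0

# Linearly separable function (logical AND): only (1, 1) lies above the line.
assert [perceptron(a, b) for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]] == [0, 0, 0, 1]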
@@ -12,9 +12,9 @@
 In other words, for each $s \in \mathcal{S}$, $\mathcal{N}(s) \subseteq \mathcal{S}$.

 \begin{example}[Travelling salesman problem]
-Problem: find an Hamiltonian tour of minimum cost in an undirected graph.
+Problem: find a Hamiltonian tour of minimum cost in an undirected graph.

-A possible neighborhood of a state applies the $k$-exchange that guarantees to maintain an Hamiltonian tour.
+A possible neighborhood of a state applies the $k$-exchange that guarantees to maintain a Hamiltonian tour.
 \begin{figure}[ht]
 \begin{subfigure}{.5\textwidth}
 \centering
@@ -37,7 +37,7 @@

 \item[Global optima]
 Given an evaluation function $f$,
-a global optima (maximization case) is a state $s_\text{opt}$ such that:
+a global optimum (maximization case) is a state $s_\text{opt}$ such that:
 \[ \forall s \in \mathcal{S}: f(s_\text{opt}) \geq f(s) \]

 Note: a larger neighborhood usually allows to obtain better solutions.
@@ -55,7 +55,7 @@
 \marginnote{Iterative improvement (hill climbing)}
 Algorithm that only performs moves that improve the current solution.

-It does not keep track of the explored states (i.e. may return in a previously visited state) and
+It does not keep track of the explored states (i.e. may return to a previously visited state) and
 stops after reaching a local optima.

 \begin{algorithm}
@@ -169,7 +169,7 @@ moves can be stored instead but, with this approach, some still not visited solu
 \marginnote{Iterated local search}
 Based on two steps:
 \begin{descriptionlist}
-\item[Subsidiary local search steps] Efficiently reach a local optima (intensification).
+\item[Subsidiary local search steps] Efficiently reach a local optimum (intensification).
 \item[Perturbation steps] Escape from a local optima (diversification).
 \end{descriptionlist}
 In addition, an acceptance criterion controls the two steps.
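
As an illustrative aside, iterative improvement (hill climbing) as described in these hunks fits in a few lines; `neighbors` and `f` are hypothetical callbacks. The loop accepts only improving moves and stops at a local optimum, which is exactly what the perturbation steps of iterated local search are meant to escape.

def hill_climbing(initial, neighbors, f):
    current = initial
    while True:
        best = max(neighbors(current), key=f, default=None)
        if best is None or f(best) <= f(current):
            return current                 # local optimum: no improving move
        current = best                     # only improving moves are performed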
@@ -194,7 +194,7 @@ Population based meta heuristics are built on the following concepts:
 \begin{descriptionlist}
 \item[Adaptation] Organisms are suited to their environment.
 \item[Inheritance] Offspring resemble their parents.
-\item[Natural selection] Fit organisms have many offspring, others become extinct.
+\item[Natural selection] Fit organisms have many offspring while others become extinct.
 \end{descriptionlist}

 \begin{table}[ht]
@@ -244,10 +244,10 @@ Genetic operators are:
 \includegraphics[width=0.2\textwidth]{img/_genetic_mutation.pdf}
 \end{center}
 \item[Proportional selection]
-Probability of a individual to be chosen as parent of the next offspring.
+Probability of an individual to be chosen as parent of the next offspring.
 Depends on the fitness.
 \item[Generational replacement]
-Create the new generation. Possibile approaches are:
+Create the new generation. Possible approaches are:
 \begin{itemize}
 \item Completely replace the old generation with the new one.
 \item Keep the best $n$ individual from the new and old population.

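
As an illustrative aside, proportional (roulette-wheel) selection can be sketched with the standard library; the fitness function is a hypothetical callback assumed to return positive values.

import random

def proportional_selection(population, fitness):
    # The probability of an individual being chosen as a parent is
    # proportional to its fitness.
    weights = [fitness(ind) for ind in population]
    return random.choices(population, weights=weights, k=1)[0]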
@@ -124,7 +124,7 @@ The direction of the search can be:

 \subsection{Deductive planning}
 \marginnote{Deductive planning}
-Formulates the planning problem using first order logic to represent states, goals and actions.
+Formulates the planning problem using first-order logic to represent states, goals and actions.
 Plans are generated as theorem proofs.

 \subsubsection{Green's formulation}
@@ -163,7 +163,7 @@ The main concepts are:
 \end{example}

 \item[Frame axioms]
-Besides the effects of actions, each state also have to define for all non-changing fluents their frame axioms.
+Besides the effects of actions, each state also has to define for all non-changing fluents their frame axioms.
 If the problem is complex, the number of frame axioms becomes unreasonable.
 \begin{example}[Moving blocks]
 \[ \texttt{on(U, V, S)} \land \texttt{diff(U, X)} \rightarrow \texttt{on(U, V, do(MOVE(X, Y, Z), S))} \]
@@ -247,7 +247,7 @@ Kowalsky's formulation avoids the frame axioms problem by using a set of fixed p
 Actions can be described as:
 \[ \texttt{poss(S)} \land \texttt{pact(A, S)} \rightarrow \texttt{poss(do(A, S))} \]

-In the Kowalsky's formulation, each action requires a frame assertion (in Green's formulation, each state requires frame axioms).
+In Kowalsky's formulation, each action requires a frame assertion (in Green's formulation, each state requires frame axioms).

 \begin{example}[Moving blocks]
 An initial state can be described by the following axioms:\\[0.5em]
@@ -381,7 +381,7 @@ def strips(problem):
 Since there are non-deterministic choices, the search space might become very large.
 Heuristics can be used to avoid this.

-Conjunction of goals are solved separately, but this can lead to the \marginnote{Sussman anomaly} \textbf{Sussman anomaly}
+Conjunction of goals is solved separately, but this can lead to the \marginnote{Sussman anomaly} \textbf{Sussman anomaly}
 where a sub-goal destroys what another sub-goal has done.
 For this reason, when a conjunction is encountered, it is not immediately popped from the goal stack
 and is left as a final check.
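
As an illustrative aside, the operator application underlying STRIPS can be sketched with sets of ground facts; the blocks-world facts below are hypothetical examples for this page, not the notes' formulation.

def apply_operator(state, preconditions, add_list, delete_list):
    if not preconditions <= state:                 # preconditions must hold
        raise ValueError("operator not applicable")
    return (state - delete_list) | add_list        # delete, then add effects

s0 = {"on(a,table)", "on(b,table)", "clear(a)", "clear(b)"}
s1 = apply_operator(s0,
                    preconditions={"clear(a)", "clear(b)", "on(a,table)"},
                    add_list={"on(a,b)"},
                    delete_list={"clear(b)", "on(a,table)"})
print(s1)   # {'on(b,table)', 'clear(a)', 'on(a,b)'} (set order may vary)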
@@ -698,7 +698,7 @@ In macro-operators, two types of operators are defined:
 \item[Macro] Set of atomic operators. Before execution, this type of operator has to be decomposed.
 \begin{description}
 \item[Precompiled decomposition]
-The decomposition is known and described along side the preconditions and effects of the operator.
+The decomposition is known and described alongside the preconditions and effects of the operator.
 \item[Planned decomposition]
 The planner has to synthesize the atomic operators that compose a macro operator.
 \end{description}
@@ -708,7 +708,7 @@ In macro-operators, two types of operators are defined:
 \begin{itemize}
 \item $X$ must be the effect of at least an atomic action in $P$ and should be protected until the end of $P$.
 \item Each precondition of the actions in $P$ must be guaranteed by previous actions or be a precondition of $A$.
-\item $P$ must not threat any causal link.
+\item $P$ must not threaten any causal link.
 \end{itemize}

 Moreover, when a macro action $A$ is replaced with its decomposition $P$:
@@ -747,7 +747,7 @@ def hdpop(initial_state, goal, actions, decomposition_methods):

 \section{Conditional planning}
 \marginnote{Conditional planning}
-Conditional planning is based on the open world assumption where what is not in the initial state is unknown.
+Conditional planning is based on the open-world assumption where what is not in the initial state is unknown.
 It generates a different plan for each source of uncertainty and therefore has exponential complexity.

 \begin{description}
@@ -768,7 +768,7 @@ It generates a different plan for each source of uncertainty and therefore has e


 \section{Reactive planning}
-Reactive planners are on-line algorithms able to interact with the dynamicity the world.
+Reactive planners are online algorithms able to interact with the dynamicity of the world.

 \subsection{Pure reactive systems}
 \marginnote{Pure reactive systems}
@@ -777,7 +777,7 @@ The choice of the action is predictable. Therefore, this approach is not suited

 \subsection{Hybrid systems}
 \marginnote{Hybrid systems}
-Hybrid planners integrate the generative and reactive approach.
+Hybrid planners integrate the generative and reactive approaches.
 The steps the algorithm does are:
 \begin{itemize}
 \item Generates a plan to achieve the goal.

@@ -76,7 +76,7 @@ def expand(node, problem):
 \subsection{Strategies}
 \begin{description}
 \item[Non-informed strategy] \marginnote{Non-informed strategy}
-Domain knowledge not available. Usually does an exhaustive search.
+Domain knowledge is not available. Usually does an exhaustive search.

 \item[Informed strategy] \marginnote{Informed strategy}
 Use domain knowledge by using heuristics.
@@ -112,7 +112,7 @@ Always expands the least deep node. The fringe is implemented as a queue (FIFO).
 \hline
 \textbf{Completeness} & Yes \\
 \hline
-\textbf{Optimality} & Only with uniform cost (i.e. all edges have same cost) \\
+\textbf{Optimality} & Only with uniform cost (i.e. all edges have the same cost) \\
 \hline
 \textbf{\makecell{Time and space\\complexity}}
 & $O(b^d)$, where the solution depth is $d$ and the branching factor is $b$ (i.e. each non-leaf node has $b$ children) \\
@@ -238,7 +238,7 @@ estimate the effort needed to reach the final goal.
 \subsection{Best-first search}
 \marginnote{Best-first seacrh}
 Uses heuristics to compute the desirability of the nodes (i.e. how close they are to the goal).
-The fringe is ordered according the estimated scores.
+The fringe is ordered according to the estimated scores.


 \begin{description}

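
As an illustrative aside, best-first search keeps the fringe as a priority queue ordered by the heuristic estimate; in this sketch `expand`, `h` and `goal_test` are hypothetical problem callbacks.

import heapq

def best_first_search(start, goal_test, expand, h):
    fringe = [(h(start), 0, start)]        # (score, tie-breaker, node)
    counter = 1
    visited = set()
    while fringe:
        _, _, node = heapq.heappop(fringe) # most desirable node first
        if goal_test(node):
            return node
        if node in visited:
            continue
        visited.add(node)
        for child in expand(node):
            heapq.heappush(fringe, (h(child), counter, child))
            counter += 1
    return None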
@@ -2,7 +2,7 @@

 \begin{description}
 \item[Swarm intelligence] \marginnote{Swarm intelligence}
-Group of locally-interacting agents that
+Group of locally interacting agents that
 shows an emergent behavior without a centralized control system.

 A swarm intelligent system has the following features:
@@ -15,7 +15,7 @@
 \item The system adapts to changes.
 \end{itemize}

-Agents interact between each other and obtain positive and negative feedbacks.
+Agents interact with each other and obtain positive and negative feedbacks.

 \item[Stigmergy] \marginnote{Stigmergy}
 Form of indirect communication where an agent modifies the environment and the others react to it.
@@ -45,7 +45,7 @@ They also tend to prefer paths marked with the highest pheromone concentration.
 \begin{itemize}
 \item Nodes are cities.
 \item Edges are connections between cities.
-\item A solution is an Hamiltonian path in the graph.
+\item A solution is a Hamiltonian path in the graph.
 \item Constraints to avoid sub-cycles (i.e. avoid visiting a city multiple times).
 \end{itemize}
 \end{example}
@@ -123,7 +123,7 @@ The algorithm has the following phases:
 \item[Initialization]
 The initial nectar source of each bee is determined randomly.
 Each solution (nectar source) is a vector $\vec{x}_m \in \mathbb{R}^n$ and
-each of its component is initialized constrained to a lower ($l_i$) and upper ($u_i$) bound:
+each of its components is initialized constrained to a lower ($l_i$) and upper ($u_i$) bound:
 \[ \vec{x}_m\texttt{[}i\texttt{]} = l_i + \texttt{rand}(0, 1) \cdot (u_i - l_i) \]

 \item[Employed bees]
@@ -139,7 +139,7 @@ The algorithm has the following phases:
 Onlooker bees stochastically choose their food source.
 Each food source $\vec{x}_m$ has a probability associated to it defined as:
 \[ p_m = \frac{\texttt{fit}(\vec{x}_m)}{\sum_{i=1}^{n_\text{bees}} \texttt{fit}(\vec{x}_i)} \]
-This provides a positive feedback as more promising solutions have a higher probability to be chosen.
+This provides a positive feedback as more promising solutions have a higher probability of being chosen.

 \item[Scout bees]
 Scout bees choose a nectar source randomly.
@@ -166,7 +166,7 @@ The algorithm has the following phases:
 \section{Particle swarm optimization (PSO)}
 \marginnote{Particle swarm optimization (PSO)}

-In a bird flock, the movement of the individuals tend to:
+In a bird flock, the movement of the individuals tends to:
 \begin{itemize}
 \item Follow the neighbors.
 \item Stay in the flock.
@@ -174,8 +174,8 @@ In a bird flock, the movement of the individuals tend to:
 \end{itemize}
 However, a model based on these rules does not have a common objective.

-PSO introduces as common objective the search of food.
-Each individual that finds food can:
+PSO introduces as a common objective the search for food.
+Each individual who finds food can:
 \begin{itemize}
 \item Move away from the flock and reach the food.
 \item Stay in the flock.
@@ -197,7 +197,7 @@ Applied to optimization problems, the bird flock metaphor can be interpreted as:
 \end{descriptionlist}

 Given a cost function $f: \mathbb{R}^n \rightarrow \mathbb{R}$ to minimize (gradient is not known),
-PSO initializes a swarm of particles (agents) whose movement is guided by the best known position.
+PSO initializes a swarm of particles (agents) whose movement is guided by the best-known position.
 Each particle is described by:
 \begin{itemize}
 \item Its position $\vec{x}_i \in \mathbb{R}^n$ in the search space.

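
As an illustrative aside, the PSO movement rule for a single particle, guided by its personal best and the best-known position of the swarm, can be sketched as below; the coefficient values are typical defaults, not taken from the notes.

import random

def pso_step(x, v, personal_best, global_best, w=0.7, c1=1.5, c2=1.5):
    new_v = [w * v[i]
             + c1 * random.random() * (personal_best[i] - x[i])   # cognitive pull
             + c2 * random.random() * (global_best[i] - x[i])     # social pull
             for i in range(len(x))]
    new_x = [x[i] + new_v[i] for i in range(len(x))]
    return new_x, new_v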
Binary file not shown.
Binary file not shown.
@@ -7,7 +7,7 @@
 \item[Business process management] \marginnote{Business process management}
 Methods to design, manage and analyze business processes by mining data contained in information systems.

-Business processes help in making decisions and automations.
+Business processes help in making decisions and automation.

 \item[Business process lifecycle] \phantom{}
 \begin{description}
@@ -50,7 +50,7 @@



-\section{Business process modelling}
+\section{Business process modeling}

 \begin{description}
 \item[Activity instance] \marginnote{Activity instance}
@@ -73,30 +73,30 @@
 \end{description}


-\subsection{Control flow modelling}
+\subsection{Control flow modeling}

 \begin{description}
-\item[Process modelling types] \phantom{}
+\item[Process modeling types] \phantom{}
 \begin{description}
 \item[Procedural vs declarative] \phantom{}
 \begin{description}
-\item[Procedural] \marginnote{Procedural modelling}
+\item[Procedural] \marginnote{Procedural modeling}
 Based on a strict ordering of the steps.
 Uses conditional choices, loops, parallel execution, events.

 Subject to the spaghetti-like process problem.

-\item[Declarative] \marginnote{Declarative modelling}
+\item[Declarative] \marginnote{Declarative modeling}
 Based on the properties that should hold during execution.
-Uses concepts as: executions, expected executions, prohibited executions.
+Uses concepts such as executions, expected executions, prohibited executions.
 \end{description}

 \item[Closed vs open] \phantom{}
 \begin{description}
-\item[Closed] \marginnote{Closed modelling}
+\item[Closed] \marginnote{Closed modeling}
 The execution of non-modelled activities is prohibited.

-\item[Open] \marginnote{Open modelling}
+\item[Open] \marginnote{Open modeling}
 Constraints to allow non-modelled activities.
 \end{description}
 \end{description}
@@ -104,13 +104,13 @@

 The most common combination of approaches are:
 \begin{descriptionlist}
-\item[Closed procedural process modelling]
-\item[Open declarative process modelling]
+\item[Closed procedural process modeling]
+\item[Open declarative process modeling]
 \end{descriptionlist}



-\section{Closed procedural process modelling}
+\section{Closed procedural process modeling}

 \begin{description}
 \item[Process model]
@@ -230,14 +230,14 @@ The most common combination of approaches are:
 \begin{table}[H]
 \centering
 \begin{tabular}{c|c}
-\textbf{Petri nets} & \textbf{Business process modelling} \\
+\textbf{Petri nets} & \textbf{Business process modeling} \\
 \hline
 Petri net & Process model \\
 Transitions & Activity models \\
 Tokens & Instances \\
 Transition firing & Activity execution \\
 \end{tabular}
-\caption{Petri nets and business process modelling concepts equivalence}
+\caption{Petri nets and business process modeling concepts equivalence}
 \end{table}


@@ -299,7 +299,7 @@ De-facto standard for business process representation.
 Drawn as a thin-bordered circle.

 \item[Intermediate event]
-Event occurring after the start of a process, but before its end.
+Event occurring after the start of a process but before its end.

 \item[End event]
 Indicates the end of a process and optionally provides its result.
@@ -355,7 +355,7 @@ De-facto standard for business process representation.



-\section{Open declarative process modelling}
+\section{Open declarative process modeling}

 Define formal properties for process models (i.e. more formal than procedural methods).
 Properties defined in term of the evolution of the process (similar to the evolution of the world in modal logics)
@@ -423,7 +423,7 @@ Based on constraints that must hold in every possible execution of the system.
 \end{description}

 \item[Semantics]
-The semantic of the constraints can be defined using LTL.
+The semantics of the constraints can be defined using LTL.

 \item[Verifiable properties] \phantom{}
 \begin{description}
@@ -497,7 +497,7 @@ Based on constraints that must hold in every possible execution of the system.
 \item[Process discovery] \marginnote{Process discovery}
 Learn a process model representative of the input event log.

-More formally, a process discovery algorithm is a function that maps an event log into a business process modelling language.
+More formally, a process discovery algorithm is a function that maps an event log into a business process modeling language.
 In our case, we map logs into Petri nets (preferably workflow nets).

 \begin{remark}
@@ -591,7 +591,7 @@ Based on constraints that must hold in every possible execution of the system.
 \item[Model evaluation]
 Different models can capture the same process described in a log.
 This allows for models that are capable of capturing all the possible traces of a log but
-are unable provide any insight (e.g. flower Petri net).
+are unable to provide any insight (e.g. flower Petri net).

 \begin{figure}[H]
 \centering
@@ -608,7 +608,7 @@ Based on constraints that must hold in every possible execution of the system.
 \item[Precision] \marginnote{Precision}
 How the model is able to capture rare cases.
 \item[Generalization] \marginnote{Generalization}
-How the model generalize on the training traces.
+How the model generalizes on the training traces.
 \end{descriptionlist}
 \end{description}

@@ -618,7 +618,7 @@ Based on constraints that must hold in every possible execution of the system.

 \begin{description}
 \item[Descriptive model discrepancies] \marginnote{Descriptive model}
-The model need to be improved.
+The model needs to be improved.

 \item[Prescriptive model discrepancies] \marginnote{Prescriptive model}
 The traces need to be checked as the model cannot be changed (e.g. model of the law).
@@ -632,14 +632,14 @@ Based on constraints that must hold in every possible execution of the system.
 \begin{description}
 \item[Token replay] \marginnote{Token replay}
 Given a trace and a Petri net, the trace is replayed on the model by moving tokens around.
-The trace is conform if the end event can be reached, otherwise it is not.
+The trace is conform if the end event can be reached, otherwise, it is not.

 A modified version of token replay allows to add or remove tokens when the trace is stuck on the Petri net.
 These external interventions are tracked and used to compute a fitness score (i.e. degree of conformance).

 Limitations:
 \begin{itemize}
-\item Fitness tend to be high for extremely problematic logs.
+\item Fitness tends to be high for extremely problematic logs.
 \item If there are too many deviations, the model is flooded with tokens and may result in unexpected behaviors.
 \item It is a Petri net specific algorithm.
 \end{itemize}
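
As an illustrative aside, a much simplified token replay can be sketched in Python. The net encoding (an activity consumes from and produces into sets of places) and the fitness proxy are assumptions of this sketch, not the full algorithm.

def token_replay(trace, transitions, initial_marking, final_places):
    marking = dict(initial_marking)        # place -> number of tokens
    missing = 0
    for activity in trace:
        consumed, produced = transitions[activity]
        for place in consumed:
            if marking.get(place, 0) > 0:
                marking[place] -= 1
            else:
                missing += 1               # token added externally (deviation)
        for place in produced:
            marking[place] = marking.get(place, 0) + 1
    for place in final_places:             # the end event must be reachable
        if marking.get(place, 0) > 0:
            marking[place] -= 1
        else:
            missing += 1
    remaining = sum(marking.values())      # leftover tokens are also deviations
    return missing == 0 and remaining == 0, missing, remaining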
@@ -251,12 +251,12 @@ The following algorithms can be employed:
 \subsection{Open world assumption}

 \begin{description}
-\item[Open world assumption] \marginnote{Open world assumption}
-If a sentence cannot be inferred, its truth values is unknown.
+\item[Open-world assumption] \marginnote{Open-world assumption}
+If a sentence cannot be inferred, its truth value is unknown.
 \end{description}

-Description logics are based on the open world assumption.
-To reason in open world assumption, all the possible models are split upon encountering an unknown facts
+Description logics are based on the open-world assumption.
+To reason in open world assumption, all the possible models are split upon encountering unknown facts
 depending on the possible cases (Oedipus example).


@@ -45,7 +45,7 @@ RETE is an efficient algorithm for implementing rule-based systems.
 A pattern can test:
 \begin{descriptionlist}
 \item[Intra-element features] Features that can be tested directly on a fact.
-\item[Inter-element features] Features that involves more facts.
+\item[Inter-element features] Features that involve more facts.
 \end{descriptionlist}

 \item[Conflict set] \marginnote{Conflict set}
@@ -60,11 +60,11 @@ RETE is an efficient algorithm for implementing rule-based systems.
 \begin{descriptionlist}
 \item[Alpha-network] \marginnote{Alpha-network}
 For intra-element features.
-The outcome is stored into alpha-memories and used by the beta network.
+The outcome is stored in alpha-memories and used by the beta network.

 \item[Beta-network] \marginnote{Beta-network}
 For inter-element features.
-The outcome is stored into beta-memories and corresponds to the conflict set.
+The outcome is stored in beta-memories and corresponds to the conflict set.
 \end{descriptionlist}
 If more rules use the same pattern, the node of that pattern is reused and possibly outputting to different memories.
 \end{description}
@@ -83,7 +83,7 @@ The best approach depends on the use case.

 \subsection{Execution}
 By default, RETE executes all the rules in the agenda and
-then checks possible side effects that modified the working memory in a second moment.
+then checks for possible side effects that modify the working memory in a second moment.

 Note that it is very easy to create loops.

@@ -162,7 +162,7 @@ RETE-based rule engine that uses Java.
 Event detected outside an event processing system (e.g. a sensor). It does not provide any information alone.

 \item[Complex event] \marginnote{Complex event}
-Event generated by an event processing system and provides higher informative payload.
+Event generated by an event processing system and provides a higher informative payload.

 \item[Complex event processing (CEP)] \marginnote{Complex event processing}
 Paradigm for dealing with a large amount of information.
@@ -185,7 +185,7 @@ Drools supports CEP by representing events as facts.
 \end{description}

 \item[Expiration]
-Mechanism to specify an expiration time to events and discard them from the working memory.
+Mechanism to specify an expiration time for events and discard them from the working memory.

 \item[Temporal reasoning]
 Allen's temporal operators for temporal reasoning.

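
As an illustrative aside, the alpha/beta split that RETE is built on can be mimicked in a few lines: alpha memories hold the facts passing intra-element tests, and a beta join over shared values (inter-element features) yields the conflict set. The grandparent rule and the tuple encoding of facts are hypothetical, and this deliberately omits RETE's incremental memories.

facts = [("parent", "ann", "bob"), ("parent", "bob", "carl"), ("age", "ann", 70)]

alpha_parent = [f for f in facts if f[0] == "parent"]     # alpha memory

# Beta join for: parent(X, Y), parent(Y, Z) -> grandparent(X, Z)
conflict_set = [(a[1], b[2])
                for a in alpha_parent
                for b in alpha_parent
                if a[2] == b[1]]
print(conflict_set)   # [('ann', 'carl')]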
@@ -17,7 +17,7 @@
 Properties:
 \begin{itemize}
 \item Should be applicable to almost any special domain.
-\item Combining general concepts should not incur in inconsistences.
+\item Combining general concepts should not incur in inconsistencies.
 \end{itemize}

 Approaches to create ontologies:
@@ -35,7 +35,7 @@
 \item[Category] \marginnote{Category}
 Used in human reasoning when the goal is category-driven (in contrast to specific-instance-driven).

-In first order logic, categories can be represented through:
+In first-order logic, categories can be represented through:
 \begin{descriptionlist}
 \item[Predicate] \marginnote{Predicate categories}
 A predicate to tell if an object belongs to a category
@@ -158,7 +158,7 @@ A property of objects.

 \section{Semantic networks}
 \marginnote{Semantic networks}
-Graphical representation of objects and categories connected through labelled links.
+Graphical representation of objects and categories connected through labeled links.

 \begin{figure}[h]
 \centering
@@ -189,7 +189,7 @@ Graphical representation of objects and categories connected through labelled li

 \begin{description}
 \item[Limitations]
-Compared to first order logic, semantic networks do not have:
+Compared to first-order logic, semantic networks do not have:
 \begin{itemize}
 \item Negations.
 \item Universally and existentially quantified properties.
@@ -202,7 +202,7 @@ Graphical representation of objects and categories connected through labelled li
 This approach is powerful but does not have a corresponding logical meaning.

 \item[Advantages]
-With semantic networks it is easy to attach default properties to categories and
+With semantic networks, it is easy to attach default properties to categories and
 override them on the objects (i.e. \texttt{Legs} of \texttt{John}).
 \end{description}

@@ -213,7 +213,7 @@ Graphical representation of objects and categories connected through labelled li
 Knowledge that describes an object in terms of its properties.
 Each frame has:
 \begin{itemize}
-\item An unique name
+\item A unique name
 \item Properties represented as pairs \texttt{<slot - filler>}
 \end{itemize}

@@ -3,7 +3,7 @@
 \begin{description}
 \item[Probabilistic logic programming] \marginnote{Probabilistic logic programming}
 Adds probability distributions over logic programs allowing to define different worlds.
-Joint distributions can also be defined over worlds and allows to answer to queries.
+Joint distributions can also be defined over worlds and allow to answer to queries.
 \end{description}


@@ -44,7 +44,7 @@ It may be useful to first have a look at the "Logic programming" section of
 Variables appearing in a fact are quantified universally.
 \[ \texttt{A(X).} \equiv \forall \texttt{X}: \texttt{A(X)} \]
 \item[Rules]
-Variables appearing the the body only are quantified existentially.
+Variables appearing in the body only are quantified existentially.
 Variables appearing in both the head and the body are quantified universally.
 \[ \texttt{A(X) :- B(X, Y).} \equiv \forall \texttt{X}, \exists \texttt{Y} : \texttt{A(X)} \Leftarrow \texttt{B(X, Y)} \]

@@ -72,7 +72,7 @@ It may be useful to first have a look at the "Logic programming" section of
 \end{descriptionlist}

 \item[SLD resolution] \marginnote{SLD}
-Prolog uses SLD resolution with the following choices:
+Prolog uses a SLD resolution with the following choices:
 \begin{descriptionlist}
 \item[Left-most] Always proves the left-most literal first.
 \item[Depth-first] Applies the predicates following the order of definition.
@@ -204,7 +204,7 @@ Therefore, if \texttt{qj, \dots, qn} fails, there won't be backtracking and \tex
 Adding new axioms to the program may change the set of valid theorems.
 \end{description}

-As first-order logic in undecidable, closed-world assumption cannot be directly applied in practice.
+As first-order logic is undecidable, the closed-world assumption cannot be directly applied in practice.

 \item[Negation as failure] \marginnote{Negation as failure}
 A negated atom $\lnot A$ is considered true iff $A$ fails in finite time:
@@ -222,9 +222,8 @@ Therefore, if \texttt{qj, \dots, qn} fails, there won't be backtracking and \tex
 \begin{itemize}
 \item If \texttt{L$_i$} is positive, apply the normal SLD resolution.
 \item If \texttt{L$_i$} = $\lnot A$, prove that $A$ fails in finite time.
 If it succeeds, \texttt{L$_i$} fails.
 \end{itemize}
-\item Solve the goal \texttt{:- L$_1$, \dots, L$_{i-1}$, L$_{i+1}$, \dots L$_m$}.
+\item Solve the remaining goal \texttt{:- L$_1$, \dots, L$_{i-1}$, L$_{i+1}$, \dots, L$_m$}.
 \end{enumerate}

 \begin{theorem}
@@ -407,7 +406,7 @@ father(mario, paola).
 The operator \texttt{T =.. L} unifies \texttt{L} with a list where
 its head is the head of \texttt{T} and the tail contains the remaining arguments of \texttt{T}
 (i.e. puts all the components of a predicate into a list).
-Only one between \texttt{T} and \texttt{L} may be a variable.
+Only one between \texttt{T} and \texttt{L} can be a variable.

 \begin{example} \phantom{} \\
 \begin{minipage}{0.5\textwidth}
@@ -458,7 +457,7 @@ father(mario, paola).

 Note that \texttt{:- assert((p(X)))} quantifies \texttt{X} existentially as it is a query.
 If it is not ground and added to the database as is,
-is becomes a clause and therefore quantified universally: $\forall \texttt{X}: \texttt{p(X)}$.
+it becomes a clause and therefore quantified universally: $\forall \texttt{X}: \texttt{p(X)}$.

 \begin{example}[Lemma generation] \phantom{}
 \begin{lstlisting}[language={}]
@@ -473,7 +472,7 @@ father(mario, paola).
 generate_lemma(T) :- assert(T).
 \end{lstlisting}

-\texttt{generate\_lemma/1} allows to add to the clauses database all the intermediate steps to compute the Fibonacci sequence
+The custom defined \texttt{generate\_lemma/1} allows to add to the clauses database all the intermediate steps to compute the Fibonacci sequence
 (similar concept to dynamic programming).
 \end{example}

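
As an illustrative aside, the lemma-generation idea in the hunk above is the logic-programming analogue of memoization; a Python rendering of the same Fibonacci trick:

lemmas = {0: 0, 1: 1}                      # "asserted" intermediate results

def fib(n):
    if n not in lemmas:
        lemmas[n] = fib(n - 1) + fib(n - 2)    # prove once, store the lemma
    return lemmas[n]

print(fib(30))   # 832040, computed in linear time thanks to the stored lemmas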
@@ -19,7 +19,7 @@

 \item[Uniform resource identifier] \marginnote{URI}
 Naming system to uniquely identify concepts.
-Each URI correspond to one and only one concept, but multiple URIs can refer to the same concept.
+Each URI corresponds to one and only one concept, but multiple URIs can refer to the same concept.

 \item[XML] \marginnote{XML}
 Markup language to represent hierarchically structured data.
@@ -74,7 +74,7 @@ xmlns:contact=http://www.w3.org/2000/10/swap/pim/contact#>
 \item[Database similarities]
 RDF aims to integrate different databases:
 \begin{itemize}
-\item A DB record is a RDF node.
+\item A DB record is an RDF node.
 \item The name of a column can be seen as a property type.
 \item The value of a field corresponds to the value of a property.
 \end{itemize}
@@ -87,8 +87,8 @@ xmlns:contact=http://www.w3.org/2000/10/swap/pim/contact#>
 Language to query different data sources that support RDF (natively or through a middleware).

 \item[Ontology web language (OWL)] \marginnote{Ontology web language (OWL)}
-Ontology based on RDF and description logic fragments.
-Three level of expressivity are available:
+Ontology-based on RDF and description logic fragments.
+Three levels of expressivity are available:
 \begin{itemize}
 \item OWL lite.
 \item OWL DL.

@@ -5,12 +5,12 @@

 \begin{description}
 \item[State] \marginnote{State}
-The current state of the world can be represented as a set of propositions that are true according the observation of an agent.
+The current state of the world can be represented as a set of propositions that are true according to the observation of an agent.

 The union of a countable sequence of states represents the evolution of the world. Each proposition is distinguished by its time step.

 \begin{example}
-A child has a bow and an arrow, then shoots the arrow.
+A child has a bow and an arrow and then shoots the arrow.
 \[
 \begin{split}
 \text{KB}^0 &= \{ \texttt{hasBow}^0, \texttt{hasArrow}^0 \} \\
@@ -51,7 +51,7 @@


 \section{Situation calculus (Green's formulation)}
-Situation calculus uses first order logic instead of propositional logic.
+Situation calculus uses first-order logic instead of propositional logic.

 \begin{description}
 \item[Situation] \marginnote{Situation}
@@ -142,8 +142,8 @@ Event calculus reifies fluents and events (actions) as terms (instead of predica
 \begin{description}
 \item[Deductive reasoning]
 Event calculus only allows deductive reasoning:
-it takes as input the domain-dependant axioms and a set of events, and computes a set of true fluents.
-If a new event is observed, the query need to be recomputed again.
+it takes as input the domain-dependant axioms and a set of events and computes a set of true fluents.
+If a new event is observed, the query needs to be recomputed again.
 \end{description}


@@ -183,7 +183,7 @@ Allows to add events dynamically without the need to recompute the result.

 \section{Allen's logic of intervals}

-Event calculus only captures instantaneous events that happen in given points in time.
+Event calculus only captures instantaneous events that happen at given points in time.

 \begin{description}
 \item[Allen's logic of intervals] \marginnote{Allen's logic of intervals}
@@ -217,7 +217,7 @@ Event calculus only captures instantaneous events that happen in given points in

 \section{Modal logics}

-Logic based on interacting agents with their own knowledge base.
+Logic-based on interacting agents with their own knowledge base.

 \begin{description}
 \item[Propositional attitudes] \marginnote{Propositional attitudes}
@@ -226,7 +226,7 @@ Logic based on interacting agents with their own knowledge base.
 First-order logic is not suited to represent these operators.

 \item[Modal logics] \marginnote{Modal logics}
-Modal logics have the same syntax of first-order logic with the addition of modal operators.
+Modal logics have the same syntax as first-order logic with the addition of modal operators.

 \item[Modal operator]
 A modal operator takes as input the name of an agent and a sentence (instead of a term as in FOL).
@@ -260,7 +260,7 @@ Logic based on interacting agents with their own knowledge base.
 \end{itemize}

 \begin{example}
-Alice is in a room an tosses a coin. Bob is in another room an will enter Alice's room when the coin lands to observe the result.
+Alice is in a room and tosses a coin. Bob is in another room and will enter Alice's room when the coin lands to observe the result.

 We define a model $M = (S, \pi, K_\texttt{a}, K_\texttt{b})$ on $\phi$ where:
 \begin{itemize}
@@ -359,7 +359,7 @@ The accessibility relation maps into the temporal dimension with two possible ev
 \end{description}

 \item[Semantics]
-Given a Kripke structure $M = (S, \pi, K_\texttt{1}, \dots, K_\texttt{n})$ where states are represented using integers,
+Given a Kripke structure, $M = (S, \pi, K_\texttt{1}, \dots, K_\texttt{n})$ where states are represented using integers,
 the semantic of the operators is the following:
 \begin{itemize}
 \item $(M, i) \models P \iff i \in \pi(P)$.
@@ -370,5 +370,5 @@ The accessibility relation maps into the temporal dimension with two possible ev
 \end{itemize}

 \item[Model checking] \marginnote{Model checking}
-Methods to prove properties of linear-time temporal logic based finite state machines or distributed systems.
+Methods to prove properties of linear-time temporal logic-based finite state machines or distributed systems.
 \end{description}
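
As an illustrative aside, the Kripke semantics above can be evaluated mechanically. The tuple encoding of formulas and the knowledge operator ("K_a phi holds at s iff phi holds at every state accessible to agent a from s") are assumptions of this sketch.

def holds(state, formula, pi, K):
    kind = formula[0]
    if kind == "atom":                     # (M, i) |= P  iff  i in pi(P)
        return state in pi[formula[1]]
    if kind == "not":
        return not holds(state, formula[1], pi, K)
    if kind == "and":
        return holds(state, formula[1], pi, K) and holds(state, formula[2], pi, K)
    if kind == "knows":                    # K_a phi
        agent, phi = formula[1], formula[2]
        return all(holds(t, phi, pi, K) for (s, t) in K[agent] if s == state)
    raise ValueError(kind)

# Coin example: Bob cannot distinguish state 1 (heads) from state 2 (tails).
pi = {"heads": {1}}
K = {"bob": {(1, 1), (1, 2), (2, 1), (2, 2)}}
print(holds(1, ("knows", "bob", ("atom", "heads")), pi, K))   # False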
@@ -44,7 +44,7 @@
 \item[Goal] \marginnote{Goal}
 $G := \top \mid \bot \mid A \mid C \mid G_1 \land G_2$
 \item[Constraint logic clause] \marginnote{Constraint logic clause}
-$K := A \leftarrow G$
+$K := A \Leftarrow G$
 \item[Constraint logic program] \marginnote{Constraint logic program}
 $P := K_1 \dots K_m$, for $m \geq 0$
 \end{description}
@@ -67,17 +67,17 @@
 Starting from the state $\langle A \land G, C \rangle$ of a program $P$, a transition on the atom $A$ can result in:
 \begin{description}
 \item[Unfold] \marginnote{Unfold}
-If there exists a clause $(B \leftarrow H)$ in $P$ and
+If there exists a clause $(B \Leftarrow H)$ in $P$ and
 an assignment $(B \doteq A)$ such that $((B \doteq A) \land C)$ is still valid,
 then we have a transition $\langle A \land G, C \rangle \mapsto \langle H \land G, (B \doteq A) \land C \rangle$.

 In other words, we want to develop an atom $A$ and the current constraints are denoted as $C$.
 We look for a clause whose head equals $A$, applying an assignment if needed.
-If this is possible, we transit from solving $A$ to solving the body of the clause and
+If this is possible, we transition from solving $A$ to solving the body of the clause and
 add the assignment to the set of active constraints.

 \item[Failure] \marginnote{Failure}
-If there are no clauses $(B \leftarrow H)$ with a valid assignment $((B \doteq A) \land C)$,
+If there are no clauses $(B \Leftarrow H)$ with a valid assignment $((B \doteq A) \land C)$,
 then we have a transition $\langle A \land G, C \rangle \mapsto \langle \bot, \bot \rangle$.
 \end{description}

@@ -100,7 +100,7 @@
 \begin{description}
 \item[Generate-and-test] \marginnote{Generate-and-test}
 Strategy adopted by logic programs.
-Every possible assignment to the variables are generated and tested.
+Every possible assignment to the variables is generated and tested.

 The workflow is the following:
 \begin{enumerate}

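
As an illustrative aside, generate-and-test over the usual X1 < X2 < X3 toy problem: every assignment is generated first, and the constraints are only tested on complete assignments (which is exactly what constraint propagation improves on).

from itertools import product

domains = {"X1": [1, 2, 3], "X2": [1, 2, 3], "X3": [1, 2, 3]}

solutions = []
for x1, x2, x3 in product(*domains.values()):   # generate all 27 assignments
    if x1 < x2 < x3:                            # test the constraint
        solutions.append((x1, x2, x3))
print(solutions)   # [(1, 2, 3)]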
@@ -1,4 +1,4 @@
-\chapter{First order logic}
+\chapter{First-order logic}


 \section{Syntax}
@@ -12,12 +12,12 @@ The symbols of propositional logic are:
 Unknown elements of the domain. Do not represent truth values.

 \item[Function symbols]
-Function $f^{(n)}$ applied on $n$ constants to obtain another constant.
+Function $f^{(n)}$ applied on $n$ elements of the domain to obtain another element of the domain.

 \item[Predicate symbols]
-Function $P^{(n)}$ applied on $n$ constants to obtain a truth value.
+Function $P^{(n)}$ applied on $n$ elements of the domain to obtain a truth value.

-\item[Connectives] $\forall$ $\exists$ $\land$ $\vee$ $\rightarrow$ $\lnot$ $\leftrightarrow$ $\top$ $\bot$ $($ $)$
+\item[Connectives] $\forall$ $\exists$ $\land$ $\vee$ $\Rightarrow$ $\lnot$ $\Leftrightarrow$ $\top$ $\bot$ $($ $)$
 \end{descriptionlist}

 Using the basic syntax, the following constructs can be defined:
@@ -27,7 +27,7 @@ Using the basic syntax, the following constructs can be defined:

 \item[Proposition] Denotes truth values.
 \[
-P := \top \,|\, \bot \,|\, P \land P \,|\, P \vee P \,|\, P \rightarrow P \,|\, P \leftrightarrow P \,|\,
+P := \top \,|\, \bot \,|\, P \land P \,|\, P \vee P \,|\, P \Rightarrow P \,|\, P \Leftrightarrow P \,|\,
 \lnot P \,|\, \forall x. P \,|\, \exists x. P \,|\, (P) \,|\, P^{(n)}(t_1, \dots, t_n)
 \]
 \end{descriptionlist}
@@ -35,7 +35,7 @@ Using the basic syntax, the following constructs can be defined:

 \begin{description}
 \item[Well-formed formula] \marginnote{Well-formed formula}
-The definition of well-formed formula in first order logic extends the one of
+The definition of well-formed formula in first-order logic extends the one of
 propositional logic by adding the following conditions:
 \begin{itemize}
 \item If S is well-formed, $\exists X. S$ is well-formed. Where $X$ is a variable.
@@ -44,13 +44,13 @@ Using the basic syntax, the following constructs can be defined:

 \item[Free variables] \marginnote{Free variables}
 The universal and existential quantifiers bind their variable within the scope of the formula.
-Let $F_v(F)$ be the set of free variables in a formula $F$, $F_v$ is defined as follows:
+Let $\mathcal{F}_v(F)$ be the set of free variables in a formula $F$, $\mathcal{F}_v$ is defined as follows:
 \begin{itemize}
-\item $F_v(p(t)) = \bigcup \texttt{vars}(t)$
-\item $F_v(\top) = F_v(\bot) = \varnothing$
-\item $F_v(\lnot F) = F_v(F)$
-\item $F_v(F_1 \land F_2) = F_v(F_1 \vee F_2) = F_v(F_1 \rightarrow F_2) = F_v(F_1) \cup F_v(F_2)$
-\item $F_v(\forall X.F) = F_v(\exists X.F) = F_v(F) \smallsetminus \{ X \}$
+\item $\mathcal{F}_v(p(t)) = \bigcup \{ \text{variables of $t$} \}$
+\item $\mathcal{F}_v(\top) = \mathcal{F}_v(\bot) = \varnothing$
+\item $\mathcal{F}_v(\lnot F) = \mathcal{F}_v(F)$
+\item $\mathcal{F}_v(F_1 \land F_2) = \mathcal{F}_v(F_1 \vee F_2) = \mathcal{F}_v(F_1 \Rightarrow F_2) = \mathcal{F}_v(F_1) \cup \mathcal{F}_v(F_2)$
+\item $\mathcal{F}_v(\forall X.F) = \mathcal{F}_v(\exists X.F) = \mathcal{F}_v(F) \smallsetminus \{ X \}$
 \end{itemize}

 \begin{description}
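
As an illustrative aside, the free-variable definition above translates directly into structural recursion. The tuple encoding of formulas and the uppercase-variable convention are assumptions of this sketch.

def free_vars(formula):
    kind = formula[0]
    if kind == "pred":                               # F_v(p(t)) = variables of the terms
        return {t for t in formula[2] if t.isupper()}
    if kind in ("top", "bot"):
        return set()
    if kind == "not":
        return free_vars(formula[1])
    if kind in ("and", "or", "implies"):
        return free_vars(formula[1]) | free_vars(formula[2])
    if kind in ("forall", "exists"):                 # quantifiers bind their variable
        return free_vars(formula[2]) - {formula[1]}
    raise ValueError(kind)

print(free_vars(("forall", "X", ("pred", "p", ("X", "Y")))))   # {'Y'}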
@@ -60,7 +60,7 @@ Using the basic syntax, the following constructs can be defined:
 \item[Theory] \marginnote{Theory}
 Set of sentences.

-\item[Ground term/Formula] \marginnote{Formula}
+\item[Ground term/Ground formula] \marginnote{Ground term/Ground formula}
 Proposition without variables.
 \end{description}
 \end{description}
@@ -71,13 +71,13 @@ Using the basic syntax, the following constructs can be defined:

 \begin{description}
 \item[Interpretation] \marginnote{Interpretation}
-An interpretation in first order logic $\mathcal{I}$ is a pair $(D, I)$:
+An interpretation in first-order logic $\mathcal{I}$ is a pair $(D, I)$:
 \begin{itemize}
 \item $D$ is the domain of the terms.
 \item $I$ is the interpretation function such that:
 \begin{itemize}
-\item $I(f): D^n \rightarrow D$ for every n-ary function symbol.
-\item $I(p) \subseteq D^n$ for every n-ary predicate symbol.
+\item The interpretation of an n-ary function symbol is a function $I(f): D^n \rightarrow D$.
+\item The interpretation of an n-ary predicate symbol is a relation $I(p) \subseteq D^n$.
 \end{itemize}
 \end{itemize}

@@ -101,22 +101,22 @@ Using the basic syntax, the following constructs can be defined:
 \item[Logical consequence] \marginnote{Logical consequence}
 A sentence $T_1$ is a logical consequence of $T_2$ ($T_2 \models T_1$) if
 every model of $T_2$ is also model of $T_1$:
-\[ \mathcal{I} \models T_2 \rightarrow \mathcal{I} \models T_1 \]
+\[ \mathcal{I} \models T_2 \Rightarrow \mathcal{I} \models T_1 \]

 \begin{theorem}
-It is undecidable to determine if a first order logic formula is a tautology.
+Determining if a first-order logic formula is a tautology is undecidable.
 \end{theorem}

 \item[Equivalence] \marginnote{Equivalence}
-A sentence $T_1$ is equivalent to $T_2$ if $T_1 \models T_2$ and $T_2 \models T_1$.
+A sentence $T_1$ is equivalent to $T_2$ iff $T_1 \models T_2$ and $T_2 \models T_1$.
 \end{description}

 \begin{theorem}
 The following statements are equivalent:
 \begin{enumerate}
 \item $F_1, \dots, F_n \models G$.
-\item $(\bigwedge_{i=1}^{n} F_i) \rightarrow G$ is valid.
-\item $(\bigwedge_{i=1}^{n} F_i) \land \lnot G$ is unsatisfiable.
+\item $F_1 \land \dots \land F_n \Rightarrow G$ is valid (i.e. deduction).
+\item $F_1 \land \dots \land F_n \land \lnot G$ is unsatisfiable (i.e. refutation).
 \end{enumerate}
 \end{theorem}

@@ -125,7 +125,7 @@ Using the basic syntax, the following constructs can be defined:

 \begin{description}
 \item[Substitution] \marginnote{Substitution}
-A substitution $\sigma: \mathcal{V} \rightarrow \mathcal{T}$ is a mapping from variables to terms.
+A substitution $\sigma: \mathcal{V} \Rightarrow \mathcal{T}$ is a mapping from variables to terms.
 It is written as $\{ X_1 \mapsto t_1, \dots, X_n \mapsto t_n \}$.

 The application of a substitution is the following:
@@ -134,8 +134,8 @@ Using the basic syntax, the following constructs can be defined:
 \item $f(t_1, \dots, t_n)\sigma = fp(t_1\sigma, \dots, t_n\sigma)$
 \item $\bot\sigma = \bot$ and $\top\sigma = \top$
 \item $(\lnot F)\sigma = (\lnot F\sigma)$
-\item $(F_1 \star F_2)\sigma = (F_1\sigma \star F_2\sigma)$ for $\star \in \{ \land, \vee, \rightarrow \}$
-\item $(\forall X.F)\sigma = \forall X' (F \sigma[X \mapsto X'])$ where $X'$ is a fresh variable (i.e. does not appear in $F$).
+\item $(F_1 \star F_2)\sigma = (F_1\sigma \star F_2\sigma)$ for $\star \in \{ \land, \vee, \Rightarrow \}$
+\item $(\forall X.F)\sigma = \forall X' (F \sigma[X \mapsto X'])$ where $X'$ is a fresh variable (i.e. it does not appear in $F$).
 \item $(\exists X.F)\sigma = \exists X' (F \sigma[X \mapsto X'])$ where $X'$ is a fresh variable.
 \end{itemize}

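
As an illustrative aside, applying a substitution to a term follows the first clauses of the definition above (the quantifier cases with fresh variables are omitted here). The term encoding is an assumption of this sketch.

def apply_subst(term, sigma):
    if isinstance(term, str):
        if term.isupper():
            return sigma.get(term, term)   # X sigma = sigma(X) when defined
        return term                        # constants are left unchanged
    # f(t1, ..., tn) sigma = f(t1 sigma, ..., tn sigma)
    return (term[0],) + tuple(apply_subst(t, sigma) for t in term[1:])

print(apply_subst(("f", "X", "a"), {"X": "b"}))   # ('f', 'b', 'a')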
@@ -13,8 +13,8 @@ A logic program has the following components (defined using BNF):

 \item[Horn clause] \marginnote{Horn clause}
 A clause with at most one positive literal.
-\[ K := A \leftarrow G \]
-In other words, $A$ and all the literals in $G$ are positive as $A \leftarrow G = A \vee \lnot G$.
+\[ K := A \Leftarrow G \]
+In other words, $A$ and all the literals in $G$ are positive as $A \Leftarrow G = A \vee \lnot G$.

 \item[Program] \marginnote{Program}
 $P := K_1 \dots K_m$ for $m \geq 0$
@@ -52,14 +52,14 @@ A logic program has the following components (defined using BNF):

 \item[Computed answer substitution] \marginnote{Computed answer substitution}
 Given a goal $G$ and a program $P$, if there exists a successful derivation
-$\langle G, \varepsilon \rangle \mapsto* \langle \top, \theta \rangle$,
+$\langle G, \varepsilon \rangle \mapsto^* \langle \top, \theta \rangle$,
 then the substitution $\theta$ is the computed answer substitution of $G$.

 \item[Transition] \marginnote{Transition}
 Starting from the state $\langle A \land G, \theta \rangle$ of a program $P$, a transition on the atom $A$ can result in:
 \begin{descriptionlist}
 \item[Unfold]
-If there exists a clause $(B \leftarrow H)$ in $P$ and
+If there exists a clause $(B \Leftarrow H)$ in $P$ and
 a (most general) unifier $\beta$ for $A\theta$ and $B$,
 then we have a transition: $\langle A \land G, \theta \rangle \mapsto \langle H \land G, \theta\beta \rangle$.

@@ -67,7 +67,7 @@ A logic program has the following components (defined using BNF):
 To do this, we search for a clause that has as conclusion $A\theta$ and add its premise to the things to prove.
 If a unification is needed to match $A\theta$, we add it to the substitutions of the state.
 \item[Failure]
-If there are no clauses $(B \leftarrow H)$ in $P$ with a unifier for $A\theta$ and $B$,
+If there are no clauses $(B \Leftarrow H)$ in $P$ with a unifier for $A\theta$ and $B$,
 then we have a transition: $\langle A \land G, \theta \rangle \mapsto \langle \bot, \varepsilon \rangle$.
 \end{descriptionlist}

@@ -79,7 +79,7 @@ A logic program has the following components (defined using BNF):
 This affects the length of the derivation (infinite in the worst case).

 \item[Don't-know] \marginnote{Don't-know}
-Any clause $(B \rightarrow H)$ in $P$ with an unifier for $A\theta$ and $B$ can be chosen.
+Any clause $(B \Leftarrow H)$ in $P$ with a unifier for $A\theta$ and $B$ can be chosen.
 This determines the output of the derivation.
 \end{descriptionlist}
 \end{description}
@@ -101,7 +101,7 @@ A logic program has the following components (defined using BNF):

 \begin{theorem}[Completeness]
 Given a program $P$, a goal $G$ and a substitution $\theta$,
-if $P \models G\theta$, then it exists a computed answer substitution $\sigma$ such that $G\theta = G\sigma\beta$.
+if $P \models G\theta$, then there exists a computed answer substitution $\sigma$ such that $G\theta = G\sigma\beta$.
 \end{theorem}

 \begin{theorem}

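
As an illustrative aside, the unfold transition relies on a most general unifier; a compact Python version follows (occurs check omitted for brevity). Variables are uppercase strings and compound terms are tuples; both conventions are assumptions of this sketch.

def unify(t1, t2, theta=None):
    theta = dict(theta or {})
    def walk(t):                           # follow bindings already in theta
        while isinstance(t, str) and t.isupper() and t in theta:
            t = theta[t]
        return t
    stack = [(t1, t2)]
    while stack:
        a, b = (walk(x) for x in stack.pop())
        if a == b:
            continue
        if isinstance(a, str) and a.isupper():
            theta[a] = b
        elif isinstance(b, str) and b.isupper():
            theta[b] = a
        elif (isinstance(a, tuple) and isinstance(b, tuple)
              and len(a) == len(b) and a[0] == b[0]):
            stack.extend(zip(a[1:], b[1:]))
        else:
            return None                    # clash: no unifier exists
    return theta

print(unify(("p", "X", "a"), ("p", "b", "Y")))   # {'Y': 'a', 'X': 'b'}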
@ -9,7 +9,7 @@
The symbols of propositional logic are:
\begin{descriptionlist}
\item[Proposition symbols] $p_0$, $p_1$, \dots
\item[Connectives] $\land$ $\vee$ $\rightarrow$ $\leftrightarrow$ $\lnot$ $\bot$ $($ $)$
\item[Connectives] $\land$ $\vee$ $\Rightarrow$ $\Leftrightarrow$ $\lnot$ $\bot$ $($ $)$
\end{descriptionlist}

\begin{description}
@ -22,12 +22,12 @@ The symbols of propositional logic are:
\item If $S_1$ and $S_2$ are well-formed, $S_1 \vee S_2$ is well-formed.
\end{itemize}

Note that the implication $S_1 \rightarrow S_2$ can be written as $\lnot S_1 \vee S_2$.
Note that the implication $S_1 \Rightarrow S_2$ can be written as $\lnot S_1 \vee S_2$.

The BNF definition of a formula is:
\[
F := \texttt{atomic\_proposition} \,|\, F \land F \,|\, F \vee F \,|\,
F \rightarrow F \,|\, F \leftrightarrow F \,|\, \lnot F \,|\, (F)
F \Rightarrow F \,|\, F \Leftrightarrow F \,|\, \lnot F \,|\, (F)
\]
% \[
% \begin{split}
@ -35,8 +35,8 @@ The symbols of propositional logic are:
% &\lnot \texttt{<formula>} \,|\, \\
% &\texttt{<formula>} \land \texttt{<formula>} \,|\, \\
% &\texttt{<formula>} \vee \texttt{<formula>} \,|\, \\
% &\texttt{<formula>} \rightarrow \texttt{<formula>} \,|\, \\
% &\texttt{<formula>} \leftrightarrow \texttt{<formula>} \,|\, \\
% &\texttt{<formula>} \Rightarrow \texttt{<formula>} \,|\, \\
% &\texttt{<formula>} \Leftrightarrow \texttt{<formula>} \,|\, \\
% &(\texttt{<formula>}) \\
% \end{split}
% \]
@ -64,7 +64,7 @@ The symbols of propositional logic are:
to the atoms $\{ A_1, \dots, A_n \}$ an element of $D$.
\end{itemize}

Note: given a formula $F$ of $n$ distinct atoms, there are $2^n$ district interpretations.
Note: given a formula $F$ of $n$ distinct atoms, there are $2^n$ distinct interpretations.

\begin{description}
\item[Model] \marginnote{Model}
@ -100,14 +100,14 @@ The symbols of propositional logic are:
\item $\lnot S$ is true iff $S$ is false.
\item $S_1 \land S_2$ is true iff $S_1$ is true and $S_2$ is true.
\item $S_1 \vee S_2$ is true iff $S_1$ is true or $S_2$ is true.
\item $S_1 \rightarrow S_2$ is true iff $S_1$ is false or $S_2$ is true.
\item $S_1 \leftrightarrow S_2$ is true iff $S_1 \rightarrow S_2$ is true and $S_1 \leftarrow S_2$ is true.
\item $S_1 \Rightarrow S_2$ is true iff $S_1$ is false or $S_2$ is true.
\item $S_1 \Leftrightarrow S_2$ is true iff $S_1 \Rightarrow S_2$ is true and $S_1 \Leftarrow S_2$ is true.
\end{itemize}
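As a minimal sketch of these truth conditions (Python booleans stand in for truth values; the helper names are invented), the snippet below encodes $\Rightarrow$ and $\Leftrightarrow$ and brute-forces all $2^n$ interpretations to confirm one of the equivalences listed further below, implication elimination:

from itertools import product

def implies(a, b):
    return (not a) or b                       # S1 => S2: true iff S1 false or S2 true

def iff(a, b):
    return implies(a, b) and implies(b, a)    # S1 <=> S2: implication both ways

# Enumerate all 2^n interpretations (n = 2 atoms here).
for p, q in product([False, True], repeat=2):
    assert implies(p, q) == ((not p) or q)    # implication elimination
print("(P => Q) == (not P or Q) in every interpretation")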

\item[Evaluation] \marginnote{Evaluation order}
The connectives of a propositional formula are evaluated in the order:
\[ \leftrightarrow, \rightarrow, \vee, \land, \lnot \]
The connectives of a propositional formula are evaluated in the following order:
\[ \Leftrightarrow, \Rightarrow, \vee, \land, \lnot \]
Formulas in parentheses have higher priority.

\item[Logical consequence] \marginnote{Logical consequence}
@ -128,9 +128,9 @@ The symbols of propositional logic are:
\item[Associativity]: $((P \land Q) \land R) \equiv (P \land (Q \land R))$
and $((P \vee Q) \vee R) \equiv (P \vee (Q \vee R))$
\item[Double negation elimination]: $\lnot(\lnot P) \equiv P$
\item[Contraposition]: $(P \rightarrow Q) \equiv (\lnot Q \rightarrow \lnot P)$
\item[Implication elimination]: $(P \rightarrow Q) \equiv (\lnot P \vee Q)$
\item[Biconditional elimination]: $(P \leftrightarrow Q) \equiv ((P \rightarrow Q) \land (Q \rightarrow P))$
\item[Contraposition]: $(P \Rightarrow Q) \equiv (\lnot Q \Rightarrow \lnot P)$
\item[Implication elimination]: $(P \Rightarrow Q) \equiv (\lnot P \vee Q)$
\item[Biconditional elimination]: $(P \Leftrightarrow Q) \equiv ((P \Rightarrow Q) \land (Q \Rightarrow P))$
\item[De Morgan]: $\lnot(P \land Q) \equiv (\lnot P \vee \lnot Q)$ and $\lnot(P \vee Q) \equiv (\lnot P \land \lnot Q)$
\item[Distributivity of $\land$ over $\vee$]: $(P \land (Q \vee R)) \equiv ((P \land Q) \vee (P \land R))$
\item[Distributivity of $\vee$ over $\land$]: $(P \vee (Q \land R)) \equiv ((P \vee Q) \land (P \vee R))$
@ -179,34 +179,34 @@ The symbols of propositional logic are:
\begin{description}
\item[Sound] \marginnote{Soundness}
A reasoning method $E$ is sound iff:
\[ (\Gamma \vdash^E F) \rightarrow (\Gamma \models F) \]
\[ (\Gamma \vdash^E F) \Rightarrow (\Gamma \models F) \]

\item[Complete] \marginnote{Completeness}
A reasoning method $E$ is complete iff:
\[ (\Gamma \models F) \rightarrow (\Gamma \vdash^E F) \]
\[ (\Gamma \models F) \Rightarrow (\Gamma \vdash^E F) \]
\end{description}

\item[Deduction theorem] \marginnote{Deduction theorem}
Given a set of formulas $\{ F_1, \dots, F_n \}$ and a formula $G$:
\[ (F_1 \land \dots \land F_n) \models G \,\iff\, \models (F_1 \land \dots \land F_n) \rightarrow G \]
\[ (F_1 \land \dots \land F_n) \models G \,\iff\, \models (F_1 \land \dots \land F_n) \Rightarrow G \]

\begin{proof} \phantom{}
\begin{description}
\item[$\rightarrow$])
\item[$\Rightarrow$])
By hypothesis $(F_1 \land \dots \land F_n) \models G$.

So, for each interpretation $\mathcal{I}$ in which $(F_1 \land \dots \land F_n)$ is true,
$G$ is also true.
Therefore, $\mathcal{I} \models (F_1 \land \dots \land F_n) \rightarrow G$.
Therefore, $\mathcal{I} \models (F_1 \land \dots \land F_n) \Rightarrow G$.

Moreover, for each interpretation $\mathcal{I}'$ in which $(F_1 \land \dots \land F_n)$ is false,
$(F_1 \land \dots \land F_n) \rightarrow G$ is true.
Therefore, $\mathcal{I}' \models (F_1 \land \dots \land F_n) \rightarrow G$.
$(F_1 \land \dots \land F_n) \Rightarrow G$ is true.
Therefore, $\mathcal{I}' \models (F_1 \land \dots \land F_n) \Rightarrow G$.

In conclusion, $\models (F_1 \land \dots \land F_n) \rightarrow G$.
In conclusion, $\models (F_1 \land \dots \land F_n) \Rightarrow G$.

\item[$\leftarrow$])
By hypothesis $\models (F_1 \land \dots \land F_n) \rightarrow G$.
\item[$\Leftarrow$])
By hypothesis $\models (F_1 \land \dots \land F_n) \Rightarrow G$.
Therefore, for each interpretation where $(F_1 \land \dots \land F_n)$ is true,
$G$ is also true.
@ -240,9 +240,9 @@ The symbols of propositional logic are:
\begin{description}
\item[Natural deduction] \marginnote{Natural deduction for propositional logic}
Set of rules to introduce or eliminate connectives.
We consider a subset $\{ \land, \rightarrow, \bot \}$ of functionally complete connectives.
We consider a subset $\{ \land, \Rightarrow, \bot \}$ of functionally complete connectives.

Natural deduction can be represented using a tree like structure:
Natural deduction can be represented using a tree-like structure:
\begin{prooftree}
\AxiomC{[hypothesis]}
\noLine
@ -252,12 +252,14 @@ The symbols of propositional logic are:
\RightLabel{rule name}\UnaryInfC{conclusion}
\end{prooftree}

The conclusion is true when the hypothesis are able to prove the premise.
Another tree can be built on top of premises to prove them.
The conclusion is true when the hypotheses can prove the premise.
Another tree can be built on top of the premises to prove them.

\begin{descriptionlist}
\item[Introduction] \marginnote{Introduction rules}
Usually used to prove the conclusion by splitting it.\\
Usually used to prove the conclusion by splitting it.

Note that $\lnot \psi \equiv (\psi \Rightarrow \bot)$. \\
\begin{minipage}{.4\linewidth}
\begin{prooftree}
\AxiomC{$\psi$}
@ -272,7 +274,7 @@ The symbols of propositional logic are:
\UnaryInfC{\vdots}
\noLine
\UnaryInfC{$\psi$}
\RightLabel{$\rightarrow$I}\UnaryInfC{$\varphi \rightarrow \psi$}
\RightLabel{$\Rightarrow$I}\UnaryInfC{$\varphi \Rightarrow \psi$}
\end{prooftree}
\end{minipage}

@ -293,16 +295,18 @@ The symbols of propositional logic are:
\begin{minipage}{.3\linewidth}
\begin{prooftree}
\AxiomC{$\varphi$}
\AxiomC{$\varphi \rightarrow \psi$}
\RightLabel{$\rightarrow$E}\BinaryInfC{$\psi$}
\AxiomC{$\varphi \Rightarrow \psi$}
\RightLabel{$\Rightarrow$E}\BinaryInfC{$\psi$}
\end{prooftree}
\end{minipage}

\item[Ex falso sequitur quodlibet] \marginnote{Ex falso sequitur quodlibet}
From contradiction, anything follows.
This can be used when we have two contradicting hypothesis.
This can be used when we have two contradicting hypotheses.
\begin{prooftree}
\AxiomC{$\bot$}
\AxiomC{$\psi$}
\AxiomC{$\lnot \psi$}
\BinaryInfC{$\bot$}
\RightLabel{$\bot$}\UnaryInfC{$\varphi$}
\end{prooftree}
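As a small worked example (a sketch assuming the usual $\land$-elimination rule alongside the rules above), the following tree discharges its hypothesis with $\Rightarrow$I to derive $(\varphi \land \psi) \Rightarrow \varphi$:

\begin{prooftree}
\AxiomC{$[\varphi \land \psi]$}
\RightLabel{$\land$E}\UnaryInfC{$\varphi$}
\RightLabel{$\Rightarrow$I}\UnaryInfC{$(\varphi \land \psi) \Rightarrow \varphi$}
\end{prooftree}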

@ -124,7 +124,7 @@ Given a floating-point system $\mathcal{F}(\beta, t, L, U)$, the representation

\subsection{Machine precision}
Machine precision $\varepsilon_{\text{mach}}$ determines the accuracy of a floating-point system. \marginnote{Machine precision}
Depending on the approximation approach, machine precision can be computes as:
Depending on the approximation approach, machine precision can be computed as:
\begin{descriptionlist}
\item[Truncation] $\varepsilon_{\text{mach}} = \beta^{1-t}$
\item[Rounding] $\varepsilon_{\text{mach}} = \frac{1}{2}\beta^{1-t}$
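As a quick sanity check of these formulas (a sketch for IEEE 754 double precision, where $\beta = 2$ and $t = 53$), the classic halving loop in Python recovers the truncation value $\beta^{1-t}$:

# Estimate machine precision of Python floats (IEEE 754 double: beta = 2, t = 53).
eps = 1.0
while 1.0 + eps / 2 > 1.0:     # stop once 1 + eps/2 rounds back to exactly 1
    eps /= 2
print(eps)                      # 2.220446049250313e-16 == 2**(1-53) = beta**(1-t)
# With rounding, eps_mach is half of this: 2**-53.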

@ -34,11 +34,11 @@ Note that $\max \{ f(x) \} = \min \{ -f(x)$ \}.
\subsection{Optimality conditions}

\begin{description}
\item[First order condition] \marginnote{First order condition}
\item[First-order condition] \marginnote{First-order condition}
Let $f: \mathbb{R}^N \rightarrow \mathbb{R}$ be continuous and differentiable in $\mathbb{R}^N$.
\[ \text{If } \vec{x}^* \text{ local minimum of } f \Rightarrow \nabla f(\vec{x}^*) = \nullvec \]

\item[Second order condition] \marginnote{Second order condition}
\item[Second-order condition] \marginnote{Second-order condition}
Let $f: \mathbb{R}^N \rightarrow \mathbb{R}$ be continuous and twice differentiable.
\[
\text{If } \nabla f(\vec{x}^*) = \nullvec \text{ and } \nabla^2 f(\vec{x}^*) \text{ positive definite} \Rightarrow
@ -46,7 +46,7 @@ Note that $\max \{ f(x) \} = \min \{ -f(x)$ \}.
\]
\end{description}

As the second order condition requires to compute the Hessian matrix, which is expensive, in practice only the first order condition is checked.
As the second-order condition requires computing the Hessian matrix, which is expensive, in practice only the first-order condition is checked.
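A minimal numerical check of both conditions, on the invented toy objective $f(x, y) = x^2 + y^2$ (not taken from the notes):

import numpy as np

def grad(x):                     # gradient of f(x) = ||x||^2
    return 2 * x

def hessian(x):                  # Hessian of f: 2I, positive definite everywhere
    return 2 * np.eye(len(x))

x_star = np.zeros(2)             # candidate point: the origin
first_order = np.allclose(grad(x_star), 0.0)                     # gradient vanishes
second_order = np.all(np.linalg.eigvalsh(hessian(x_star)) > 0)   # Hessian PD
print(first_order, second_order)  # True True: x_star is a local minimum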

@ -147,7 +147,7 @@ A generic gradient-like method can then be defined as:

\begin{description}
\item[Choice of the initialization point] \marginnote{Initialization point}
The starting point of an iterative method is a user defined parameter.
The starting point of an iterative method is a user-defined parameter.
For simple problems, it is usually chosen randomly in $[-1, +1]$.

For complex problems, the choice of the initialization point is critical as
@ -184,9 +184,9 @@ A generic gradient-like method can then be defined as:
\item[Difficult topologies]
\marginnote{Cliff}
A cliff in the objective function causes problems when evaluating the gradient at the edge.
With a small step size, there is a slow down in convergence.
With a small step size, there is a slowdown in convergence.
With a large step size, there is an overshoot that may cause the algorithm to diverge.
% a slow down when evaluating
% a slowdown when evaluating
% the gradient at the edge using a small step size and
% an overshoot when the step is too large.
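A minimal sketch of the step-size trade-off on the invented toy objective $f(x) = x^2$: the update is $x_{k+1} = (1 - 2\alpha)x_k$, so a small $\alpha$ contracts towards the minimum while a large one overshoots and diverges.

import numpy as np

def gd(x0, alpha, iters=50):
    x = x0
    for _ in range(iters):
        x = x - alpha * 2 * x    # x_{k+1} = x_k - alpha * grad f(x_k)
    return x

x0 = np.random.uniform(-1, 1)    # random initialization in [-1, +1]
print(gd(x0, alpha=0.1))         # |1 - 2*alpha| < 1: converges towards 0
print(gd(x0, alpha=1.1))         # |1 - 2*alpha| > 1: overshoots and diverges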

@ -145,7 +145,7 @@ This method has time complexity $O(\frac{n^3}{6})$.
\section{Iterative methods}
\marginnote{Iterative methods}
Iterative methods solve a linear system by computing a sequence that converges to the exact solution.
Compared to direct methods, they are less precise but computationally faster and more adapt for large systems.
Compared to direct methods, they are less precise but computationally faster and more suited for large systems.

The overall idea is to build a sequence of vectors $\vec{x}_k$
that converges to the exact solution $\vec{x}^*$:
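One classic instance of this idea is the Jacobi iteration sketched below (a Python toy with an invented matrix; whether these notes develop Jacobi or another splitting is not visible in this hunk). Convergence is guaranteed, e.g., for strictly diagonally dominant $\matr{A}$.

import numpy as np

def jacobi(A, b, iters=50):
    D = np.diag(A)                  # diagonal of A
    R = A - np.diagflat(D)          # off-diagonal remainder
    x = np.zeros_like(b)            # x_0
    for _ in range(iters):
        x = (b - R @ x) / D         # x_{k+1} = D^{-1} (b - R x_k)
    return x

A = np.array([[4.0, 1.0], [2.0, 5.0]])      # strictly diagonally dominant
b = np.array([1.0, 2.0])
print(jacobi(A, b))                          # approaches the exact solution
print(np.linalg.solve(A, b))                 # direct solution for comparison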

@ -192,7 +192,7 @@ Obviously, as the sequence is truncated, a truncation error is introduced when u

\section{Condition number}
Inherent error causes inaccuracies when solving a system.
This problem is independent from the algorithm and is estimated using exact arithmetic.
This problem is independent of the algorithm and is estimated using exact arithmetic.

Given a system $\matr{A}\vec{x} = \vec{b}$, we perturb $\matr{A}$ and/or $\vec{b}$ and study the inherited error.
For instance, if we perturb $\vec{b}$, we obtain the following system:
@ -210,8 +210,8 @@ Finally, we can define the \textbf{condition number} of a matrix $\matr{A}$ as:
\[ K(\matr{A}) = \Vert \matr{A} \Vert \cdot \Vert \matr{A}^{-1} \Vert \]

A system is \textbf{ill-conditioned} if $K(\matr{A})$ is large \marginnote{Ill-conditioned}
(i.e. a small perturbation of the input causes a large change of the output).
Otherwise it is \textbf{well-conditioned}. \marginnote{Well-conditioned}
(i.e. a small perturbation of the input causes a large change in the output).
Otherwise, it is \textbf{well-conditioned}. \marginnote{Well-conditioned}
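A quick NumPy illustration (the matrices are invented for the example; np.linalg.cond computes exactly $\Vert \matr{A} \Vert \cdot \Vert \matr{A}^{-1} \Vert$):

import numpy as np

A_good = np.array([[2.0, 0.0], [0.0, 1.0]])
A_bad = np.array([[1.0, 1.0], [1.0, 1.0001]])   # nearly singular

print(np.linalg.cond(A_good))   # 2.0: well-conditioned
print(np.linalg.cond(A_bad))    # ~4e4: ill-conditioned, small input changes amplify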

\section{Linear least squares problem}

@ -118,7 +118,7 @@ The parameters are determined as the most likely to predict the correct label gi
Moreover, as the dataset is identically distributed,
each $p_\vec{\uptheta}(y_n \vert \bm{x}_n)$ of the product has the same distribution.

By applying the logarithm, we have that the negative log-likelihood of a i.i.d. dataset is defined as:
By applying the logarithm, we have that the negative log-likelihood of an i.i.d. dataset is defined as:
\[ \mathcal{L}(\vec{\uptheta}) = -\sum_{n=1}^{N} \log p_\vec{\uptheta}(y_n \vert \bm{x}_n) \]
and to find good parameters $\vec{\uptheta}$, we solve the problem:
\[
@ -170,7 +170,7 @@ The parameters are determined as the most likely to predict the correct label gi
\begin{subfigure}{.45\textwidth}
\centering
\includegraphics[width=.75\linewidth]{img/gaussian_mle_bad.png}
\caption{When the parameters are bad, the label will be far the mean}
\caption{When the parameters are bad, the label will be far from the mean}
\end{subfigure}

\caption{Geometric interpretation of the Gaussian likelihood}
@ -223,7 +223,7 @@ we want to estimate the function $f$.

\begin{description}
\item[Model]
We use as predictor:
We use as the predictor:
\[ f(\vec{x}) = \vec{x}^T \vec{\uptheta} \]
Because of the noise, we use a probabilistic model with likelihood:
\[ p_\vec{\uptheta}(y \,\vert\, \vec{x}) = \mathcal{N}(y \,\vert\, f(\vec{x}), \sigma^2) \]
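A minimal sketch of this Gaussian negative log-likelihood for the linear model (the data and parameter values are invented; a better-fitting $\vec{\uptheta}$ gives a lower $\mathcal{L}(\vec{\uptheta})$):

import numpy as np

def nll(theta, X, y, sigma=1.0):
    mean = X @ theta                             # f(x) = x^T theta for each row
    logp = (-0.5 * np.log(2 * np.pi * sigma**2)
            - 0.5 * ((y - mean) / sigma) ** 2)   # log N(y_n | f(x_n), sigma^2)
    return -np.sum(logp)                         # L(theta) = -sum_n log p(y_n | x_n)

X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])   # rows are inputs x_n
y = np.array([0.1, 1.1, 1.9])
print(nll(np.array([0.0, 1.0]), X, y))   # near the data-generating theta: low NLL
print(nll(np.array([5.0, -3.0]), X, y))  # far from it: much higher NLL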

@ -322,7 +322,7 @@ Note: sometimes, instead of the full posterior, the maximum is considered (with
\item[Expected value (multivariate)] \marginnote{Expected value (multivariate)}
A multivariate random variable $X$ can be seen as
a vector of univariate random variables $\begin{pmatrix} X_1, \dots, X_D \end{pmatrix}^T$.
Its expected value can be computed element wise as:
Its expected value can be computed element-wise as:
\[
\mathbb{E}_X[g(\bm{x})] =
\begin{pmatrix} \mathbb{E}_{X_1}[g(x_1)] \\ \vdots \\ \mathbb{E}_{X_D}[g(x_D)] \end{pmatrix} \in \mathbb{R}^D
@ -466,7 +466,7 @@ Moreover, we have that:
\begin{descriptionlist}
\item[Uniform distribution] \marginnote{Uniform distribution}
Given a discrete random variable $X$ with $\vert \mathcal{T}_X \vert = N$,
$X$ has an uniform distribution if:
$X$ has a uniform distribution if:
\[ p_X(x) = \frac{1}{N}, \forall x \in \mathcal{T}_X \]

\item[Poisson distribution] \marginnote{Poisson distribution}

@ -71,7 +71,7 @@
the second matrix contains in the $i$-th row the gradient of $g_i$.

Therefore, if $g_i$ are in turn multivariate functions $g_1(s, t), g_2(s, t): \mathbb{R}^2 \rightarrow \mathbb{R}$,
the chain rule can be applies as follows:
the chain rule can be applied as follows:
\[
\frac{\text{d}f}{\text{d}(s, t)} =
\begin{pmatrix}
@ -257,7 +257,7 @@ The computation graph can be expressed as:
\]
where $g_i$ are elementary functions and $x_{\text{Pa}(x_i)}$ are the parent nodes of $x_i$ in the graph.
In other words, each intermediate variable is expressed as an elementary function of its preceding nodes.
The derivatives of $f$ can then be computed step-by-step going backwards as:
The derivatives of $f$ can then be computed step-by-step going backward as:
\[ \frac{\partial f}{\partial x_D} = 1 \text{, as by definition } f = x_D \]
\[
\frac{\partial f}{\partial x_i} = \sum_{\forall x_c: x_i \in \text{Pa}(x_c)} \frac{\partial f}{\partial x_c} \frac{\partial x_c}{\partial x_i}
@ -266,7 +266,7 @@ The derivatives of $f$ can then be computed step-by-step going backwards as:
where $\text{Pa}(x_c)$ is the set of parent nodes of $x_c$ in the graph.
In other words, to compute the partial derivative of $f$ w.r.t. $x_i$,
we apply the chain rule by computing
the partial derivative of $f$ w.r.t. the variables following $x_i$ in the graph (as the computation goes backwards).
the partial derivative of $f$ w.r.t. the variables following $x_i$ in the graph (as the computation goes backward).

Automatic differentiation is applicable to all functions that can be expressed as a computational graph and
when the elementary functions are differentiable.
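A hand-rolled sketch of this backward pass on the invented graph $x_3 = x_1 x_2$, $x_4 = x_3 + x_2$, $f = x_4$; each adjoint $\partial f / \partial x_i$ accumulates one term per child node, exactly as in the sum above.

x1, x2 = 3.0, 4.0      # inputs
x3 = x1 * x2           # forward pass stores the intermediate values
x4 = x3 + x2           # f = x4

adj = {"x4": 1.0}                  # df/dx4 = 1 by definition
adj["x3"] = adj["x4"] * 1.0        # x4 = x3 + x2  =>  dx4/dx3 = 1
adj["x2"] = adj["x4"] * 1.0        # child x4 contributes dx4/dx2 = 1
adj["x2"] += adj["x3"] * x1        # child x3 contributes dx3/dx2 = x1
adj["x1"] = adj["x3"] * x2         # only child x3: dx3/dx1 = x2
print(adj["x1"], adj["x2"])        # 4.0 4.0 = (x2, x1 + 1) at the inputs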