In the Chain demo we considered chain-structured graph, one of the simplest types of dependency where we could take advantage of the graphical structure to allow efficient inference. In this demo, we consider more complex problem involving tree-structured graphical model. In particular, we consider the case where the arcs in the graph can be arbitrary, as long as there is one, and only one, path between any pair of nodes. For such graphs we can still perform efficient inference by applying generalizations of the methods designed for chain-structured models. This example copies the idea from the Water Turbidity Problem

In order to run this demo, please execute "Demo 1D.exe" tree command

Building The Tree

We build a tree, consisting of 100 nodes with each having 4 states. Function buildTree() returns a random tree-structured graph, filled with initial node and edge potentials. The tree is build such that there is one, and only one, path between any pair of nodes and therefore does not have loops. Nodes with only arc is called a leaf. Function srand(0) assures that the graph structure remains the same during multiple execution of the demo:

using namespace DirectGraphicalModels;
const byte   nStates  = 4;                      // {very safe, safe, unsafe, very unsave}
const size_t nNodes   = 100;                    // number of nodes
const size_t nSources = 1;                      // number of sources
srand(0);
CGraphPairwise graph(nStates);
buildTree(graph);                               // Returns a tree with default potentials

Next, we separate all the nodes into 3 types and assigne them the following labels:

Source: fist nSources leaf nodes,
Tap: remaing leaf nodes and
I (internal structure): all other nodes.

std::vector<size_t>      vParents;
std::vector<size_t>      sources;
std::vector<std::string> labels(nNodes);
for (size_t n = 0; n < nNodes; n++) {
    graph.getParentNodes(n, vParents);
    if (vParents.size() <= 1) {                 // if the node is a leaf
        if (sources.size() < nSources) {
            labels[n] = "Source";               // => it is either a source
            sources.push_back(n);
        } else
            labels[n] = "Tap";                  // => or a tap
    } else
        labels[n] = "I";                        // otherwise it is an internal node
} // n

After the tree was created, its nodes had potentials \((\frac{1}{4}, \frac{1}{4}, \frac{1}{4}, \frac{1}{4})^\top\). Now, we substitute the potentials of the source nodes with the potential \((0.9, 0.09, 0.009, 0.001)^\top\), returned with function getNodePot():

Mat nodePot = getNodePot();
for(size_t n: sources)  
    graph.setNode(n, nodePot); 

The edge potential matrix is represented here as a transition matrix:

edgePot
from\to	st.0	st.1	st.2	st.3
st.0	0.9890	0.0099	0.0010	0.0001
st.1	0.1309	0.8618	0.0066	0.0007
st.2	0.0420	0.0841	0.8682	0.0057
st.3	0.0667	0.0333	0.1667	0.7333

e.g. a probability of switching from state 3 to state 1 is 0.0333. The edge potential is returned with function getEdgePot().

After the tree was created, its edges had symmetric potentials: \(\frac{1}{2}edgePot^\top+\frac{1}{2}edgePot\). Now, we have to re-initialize the edge potentials with the original nonsymmetrical \(edgePot\): set them in the direction from the source nodes to the tap nodes DirectGraphicalModels::CGraphPairwise::setArc(size_t src, size_t dst, const Mat& edgePot):

std::vector<size_t> vChilds;
std::vector<bool>   ifSource(nNodes, false);
std::deque<size_t>  sourceQueue;                // queue with indexes of the source nodes
for(size_t n: sources) {
    ifSource[n] = true;
    sourceQueue.push_back(n);
}
Mat edgePot = getEdgePot();
while (!sourceQueue.empty()) {
    size_t n1 = sourceQueue.front();            // pop the front index of a source node
    sourceQueue.pop_front();
    graph.getChildNodes(n1, vChilds);
    for(size_t n2: vChilds) {
        if (!ifSource[n2]) {                    // if the connected node is not a source
            graph.setArc(n1, n2, edgePot);      // set the potential,
            ifSource[n2] = true;                // mark it as a source
            sourceQueue.push_back(n2);          // and add it to the queue
        }
    }
}

Inference

The number of all possible configurations for our tree is equal to \(4^{100}\approx 1.61\times 10^{60}\), and it is not feasible to calculate joint probabilities for all of these configurations. Nevertheless, DGM has an efficient framework for exact inference in tree-structured graphs DirectGraphicalModels::CInferTree, which is based on the sum-product algorithm:

using namespace DirectGraphicalModels;
CInferTree inferer(graph);
inferer.infer();

Please note, that in case of a single source, its marginal probability is equal to the initial probability.