Top Posts Tagged with #ergm

Popular Recent

Applying the Frailty ERGM (FERGM) Through six substantive applications (MesaHigh, Florentine Marriage, Military Alliances, Militarized Interstate Disputes, Regional Planning, Brain Complexity), we illustrate how unobserved heterogeneity and confounding can lead to degenerate inferential network model specifications and errors, as well as poor model fit. The figure shows the improvements made by accounting for unit-level heterogeneity with the FERGM. The percent improvement is calculated as the difference in the mean accuracy of tie prediction for the FERGM relative to the ERGM across 500 simulations. Across most of these networks, we find a fairly large increase in the percent of ties accurately predicted by using the FERGM.

#social networks #sna #ERGM #frailty terms #political methodology #replication #extensions

•18+ Adults Only

Watch Anya Live on Cam

Anya is live and ready to show you everything. Watch her strip, dance, and perform exclusive shows just for you. Interact in real-time and make your fantasies come true.

✓ Live Streaming✓ Interactive Chat✓ Private Shows✓ HD Quality✓ Free Actions

Free to watch • No registration required • HD streaming

Cohesion

C O N T E N T S:

GENERAL INFO

KEY TOPICS

In this section we first report on the level of cohesion of the European Parliament?s groups by analyzing the co-voting through the agreement and ERGM measures.(More…)

While this has offered a welcome addition to the cohesion literature and a novel micro-sociological method of examining cohesion, its primary evidential base has been combat groups.(More…)

T…

View On WordPress

#AKAA #DNA #Eighth European Parliament #EKDK #ERGM #HMR

The Frailty Exponential Random Graph Model Panels show the (scaled) root mean squared error for coffecient estimates on the vertical axis and the level of the unobserved heterogeneity on the horizontal axis from the Monte Carlo study. Each column shows networks of different sizes. Each row the results for a different covariate. Results for the ERGM are reported as dotted lines and open circles, while the FERGM are shown as solid lines and closed circles. On average, the FERGM performs better than the standard ERGM in the presence of unobserved heterogeneity.

#ERGM #frailty term #heterogeneity #social network analysis #monte carlo

"thanks for putting this reunion together, but the place that was chosen is kind of hard to get to"

"take the train and then taxi"

why do I want to go to this reunion?

#just having some second thoughts #just wow great response guys #ergm #although i could take the bus if i wanted at least partways #...if i could be arsed #and fuck i don't think i can be arsed #even if i drove my bro's car #i'd end up drinking heeeeeaaavily

Exponential random graph models in python

I have had some recent experience estimating exponential random graph models (ERGMs) in python using PyMC, and I thought it could be useful to put some notes about it up here.

For readers unfamiliar with ERGM, it is a modeling framework for network or graph data. An unfortunate fact of statistical inference on networks is that the independence assumptions of ordinary least squares are violated in deep, complex and interconnected ways (one of the core insights of social network analysis is that whether or not I am friends with you is tightly related to whether or not I am friends with your friends). ERGMs attempt to bring a linear modeling approach to network estimation by assuming that these inherent interdependencies depend mostly on network characteristics that a savvy modeler can explicitly specify. If you have reason to believe that the major network dependencies in your data can be controlled for by reciprocity and transitivity, for instance, you can simply include those terms in your model specification and hope (read: assume) that the rest of your errors are independent. While they are not perfect, ERGMs have become a widely accepted method among social network analysts.

Most often, ERGMs are estimated using the ergm package for the R statistical environment. This is a wonderful package put together by many of the top researchers in the field of network inference. Recently I needed to estimate a model that allowed more low-level control of the modelling than this package allowed however, so I turned to PyMC to see if I could implement ERGM estimation myself. PyMC is an invaluable python package that makes Markov-chain Monte-Carlo (MCMC) estimation straightforward and, importantly, very fast to implement. Getting estimates for pretty much any model in PyMC takes barely any more work than just specifying that model, and it’s well designed enough that writing your own step methods (and even sampling algorithms) in python is a snap.

Gagnon & MacRae prison friendship network

As a sanity check on my work, I decided to estimate the same model on the same data with ergm in R and then with PyMC in python. The data I used is the prison friendship network from Gagnon & MacRae1, which I got from the UCINET IV datasets page. I was only interested in directed friendship ties between inmates, so I discarded any information about the prisoners themselves; you can download my cleaned up adjacency matrix.

The Model

I specify a pretty simple model for the network. I use \(Y\) to refer to the adjacency matrix of the network and \(y_{i,j}\) to refer to the \((i,j)\mathrm{th}\) element of \(Y\). The value of \(y_{i,j}\) is one if prisoner \(i\) listed prisoner \(j\) as a friend, and zero otherwise. The ERGM is specified as

\[\mathrm{Pr}(Y|\theta)\propto\exp\left[\theta'\Psi(Y)\right]\]

where \(\theta\) is our vector of coefficients, and \(\Psi(Y)\) is a vector of network statistics on \(Y\). With ERGMs, \(\Psi\) is where the modeler includes all of the statistics she deems relevant to the network ties. I include a few straightforward statistics:

mutual tie count:

The number of pairs in the network for which both of the directed edges \((i,j)\) and \((j,i)\) exist. (How many pairs of prisoners mutually nominated each other as friends)

density:

The total number of edges present in the graph. (How many total friendship nominations were made)

2-instar count:

The total number of nodes with exactly two incoming edges. (How many prisoners were listed as friends by exactly two other prisoners)

3-instar count:

The total number of nodes with exactly three incoming edges. (How many prisoners were listed as friends by exactly three other prisoners)

2-outstar count:

The total number of nodes with exactly two outgoing edges. (How many prisoners listed exactly two other prisoners as friends)

Estimating in R

Estimating this model in R using ergm is dead simple:

library(ergm) # load the data prisonAdj <- matrix(scan('prison.dat'),nrow=67,byrow=T) # create a 'network' object prisonNet <- network(prisonAdj) # estimate the model prisonEst <- ergm(prisonNet ~ mutual+istar(1:3)+ostar(2))

In the final line there, mutual is the mutual edge count, ostar(2) indicates the 2-outstar count, and istar(1:3) tells ergm to include a term for 1-, 2-, and 3-instars. The number of 1-instars is the same thing as the density.

Estimating in python

Estimating the model in python will require a bit more work. A standard result from ERGMs is that the model can be reduced to a simple logistic regression with a case for each edge in the network. In this case, each covariate represents the change in a particular network statistic if the edge in question were present versus not present. Formally, let \(Y_{i,j}^+\) be the network obtained by setting \(y_{i,j}=1\) in \(Y\), and let \(Y_{i,j}^-\) be the network obtained by setting \(y_{i,j}=0\). For a particular network statistic \(\Phi_k(Y)\) the corresponding change statistic for edge \((i,j)\) is defined as \(\phi_k(Y,i,j)=\Phi_k(Y_{i,j}^+)-\Phi_k(Y_{i,j}^-)\).

So for edge \((i,j)\) the change in the mutual edge count, \(\phi_M(Y,i,j)\) would depend on the value of \(y_{j,i}\). If \(y_{j,i}=0\), then holding the rest of the network fixed edge \((i,j)\) would be unable to change the total mutual edge count, so \(\phi(Y,i,j)=0\). If \(y_{j,i}=1\), then \(y_{i,j}\) would determine whether the edge was mutual and \(\phi(Y,i,j)=1\).

The first step in the python implementation, then, is to construct these \(\phi\) covariates for each of our network statistics.

import scipy as sp from scipy.misc import comb from itertools import product from pymc import Normal, Bernoulli, InvLogit, MCMC,deterministic # functions to get delta matrices def mutualDelta(am): return(am.copy().transpose()) def istarDelta(am,k): if k == 1: # if k == 1 then this is just density res = sp.ones(am.shape) return(res) res = sp.zeros(am.shape,dtype=int) n = am.shape[0] for i,j in product(xrange(n),xrange(n)): if i!=j: nin = am[:,j].sum()-am[i,j] res[i,j] = comb(nin,k-1,exact=True) return(res) def ostarDelta(am,k): if k == 1: # if k == 1 then this is just density res = sp.ones(am.shape) return(res) res = sp.zeros(am.shape,dtype=int) n = am.shape[0] for i,j in product(xrange(n),xrange(n)): if i!=j: nin = am[i,:].sum()-am[i,j] res[i,j] = comb(nin,k-1,exact=True) return(res)

The mutualDelta function takes an input adjacency matrix and returns an indicator of whether the reversed tie is present in the network. This is simply the transpose of the original adjacency matrix. The istarDelta and ostarDelta functions are a little more complicated, but in essence return a matrix that how many \(k\)-instars or -outstars depend on each edge in the network.

From here it is a simple matter to define the model in PyMC.

def makeModel(adjMat): # define and name the deltas termDeltas = { 'deltamutual':mutualDelta(adjMat), 'deltaistar1':istarDelta(adjMat,1), 'deltaistar2':istarDelta(adjMat,2), 'deltaistar3':istarDelta(adjMat,3), 'deltaostar2':ostarDelta(adjMat,2) } # create term list with coefficients termList = [] coefs = {} for dName,d in termDeltas.items(): tName = 'theta'+dName[5:] coefs[tName] = Normal(tName,0,0.001,value=sp.rand()-0.5) termList.append(d*coefs[tName]) # get individual edge probabilities @deterministic(trace=False,plot=False) def probs(termList=termList): probs = 1./(1+sp.exp(-1*sum(termList))) probs[sp.diag_indices_from(probs)]= 0 return(probs) # define the outcome as outcome = Bernoulli('outcome',probs,value=adjMat,observed=True) return(locals())

(This is a little bit kludgey with the renaming of the coefficient keys in the coefs dictionary). The definition of probs contains one important step that defines the probability along the diagonal to be zero, because inmates were not allowed to list themselves.

Finally, we load the data and estimate the model:

if __name__ == '__main__': # load the prison data with open('prison.dat','r') as f: rowList = list() for l in f: rowList.append([int(x) for x in l.strip().split(' ')]) adjMat = sp.array(rowList) # make the model as an MCMC object m = makeModel(adjMat) mc = MCMC(m) # estimate mc.sample(30000,1000,50)

From there PyMC handles sampling and collecting traces—remarkably easy to get use.

All the python code is combined in this python script.

Comparison

We can examine the estimates made by ergm and my PyMC approach to verify that the python version is working. Although the two methods are not formally comparable—ergm approximates the maximum-likelihood estimates while PyMC generates samples from a posterior distribution—they should yield relatively similar point estimates and ranges.

Estimates and 95-percent intervals from ergm (top, orange) and PyMC (bottom, red).

The above figure shows the estimates and the comparable 95-percent intervals for the two methods. For each model term the ergm estimate is on top (with an orange dot) and the PyMC result is on the bottom (with a red dot). In the case of ergm the central dot is the maximum-likelihood estimate of the coefficient and the interval is derived from 1.96 times the estimated standard error. For PyMC the bars indicate the 2.5, 50 and 97.5 percentiles of the MCMC sample. As expected, the results are not identical. The point estimates and the ranges differ noticeably for each term. However both methods produce estimates of the same sign, approximate magnitude, and ‘confidence’. The table below lists the point estimates and standard error/deviation for the methods.

statistic ergm (standard error) PyMC (standard deviation) mutual ties 3.51 (0.34) 3.57 (0.19) density -4.09 (0.33) -4.50 (0.28) 2-instars 0.35 (0.16) 0.47 (0.13) 3-instars -0.06 (0.05) -0.10 (0.04) 2-outstars -0.13 (0.09) -0.06 (0.06)

Conclusion

From these tests we can say that PyMC should provide a straightforward alternative to ergm for estimating exponential graph models. The model I used here was deliberately simplistic and arbitrary—I only needed to verify my work. In this case it probably would have been easier to just stick with the ergm estimates and move on.

But there are situations where ergm won’t work quite so easily. If you need to define a specialized network statistic not included in the package, for instance (I believe there is a way to incorporate user-defined network statistics in ergm, though I’ve never tried). Or you might need to extend the model (as I recently did) to include multiple networks or time-dependence. In each of these cases the PyMC approach is a simple matter of defining the ‘delta’ matrices that indicate how individual edges affect the statistics of interest.

Has anybody else had experience using PyMC for network statistics like this? I’d love to hear how you went about it.

MacRae J. (1960). Direct factor analysis of sociometric data. Sociometry, 23, 360-371↩

#pymc #python #ergm #r #mcmc #networks #methods #statistics

•18+ Adults Only

Watch Anya Live on Cam

Anya is live and ready to show you everything. Watch her strip, dance, and perform exclusive shows just for you. Interact in real-time and make your fantasies come true.

✓ Live Streaming✓ Interactive Chat✓ Private Shows✓ HD Quality✓ Free Actions

Free to watch • No registration required • HD streaming

#ergm

#social networks #sna #ERGM #frailty terms #political methodology #replication #extensions

•18+ Adults Only

Watch Anya Live on Cam

Anya is live and ready to show you everything. Watch her strip, dance, and perform exclusive shows just for you. Interact in real-time and make your fantasies come true.

✓ Live Streaming✓ Interactive Chat✓ Private Shows✓ HD Quality✓ Free Actions

Free to watch • No registration required • HD streaming

Cohesion

C O N T E N T S:

GENERAL INFO

KEY TOPICS

In this section we first report on the level of cohesion of the European Parliament?s groups by analyzing the co-voting through the agreement and ERGM measures.(More…)

While this has offered a welcome addition to the cohesion literature and a novel micro-sociological method of examining cohesion, its primary evidential base has been combat groups.(More…)

T…

View On WordPress

#AKAA #DNA #Eighth European Parliament #EKDK #ERGM #HMR

#ERGM #frailty term #heterogeneity #social network analysis #monte carlo

"thanks for putting this reunion together, but the place that was chosen is kind of hard to get to"

"take the train and then taxi"

why do I want to go to this reunion?

Exponential random graph models in python

I have had some recent experience estimating exponential random graph models (ERGMs) in python using PyMC, and I thought it could be useful to put some notes about it up here.

Gagnon & MacRae prison friendship network

The Model

\[\mathrm{Pr}(Y|\theta)\propto\exp\left[\theta'\Psi(Y)\right]\]

mutual tie count:

The number of pairs in the network for which both of the directed edges \((i,j)\) and \((j,i)\) exist. (How many pairs of prisoners mutually nominated each other as friends)

density:

The total number of edges present in the graph. (How many total friendship nominations were made)

2-instar count:

The total number of nodes with exactly two incoming edges. (How many prisoners were listed as friends by exactly two other prisoners)

3-instar count:

The total number of nodes with exactly three incoming edges. (How many prisoners were listed as friends by exactly three other prisoners)

2-outstar count:

The total number of nodes with exactly two outgoing edges. (How many prisoners listed exactly two other prisoners as friends)

Estimating in R

Estimating this model in R using ergm is dead simple:

Estimating in python

The first step in the python implementation, then, is to construct these \(\phi\) covariates for each of our network statistics.

From here it is a simple matter to define the model in PyMC.

Finally, we load the data and estimate the model:

From there PyMC handles sampling and collecting traces—remarkably easy to get use.

All the python code is combined in this python script.

Comparison

Estimates and 95-percent intervals from ergm (top, orange) and PyMC (bottom, red).

Conclusion

Has anybody else had experience using PyMC for network statistics like this? I’d love to hear how you went about it.

MacRae J. (1960). Direct factor analysis of sociometric data. Sociometry, 23, 360-371↩

#pymc #python #ergm #r #mcmc #networks #methods #statistics

•18+ Adults Only

Watch Anya Live on Cam

Anya is live and ready to show you everything. Watch her strip, dance, and perform exclusive shows just for you. Interact in real-time and make your fantasies come true.

✓ Live Streaming✓ Interactive Chat✓ Private Shows✓ HD Quality✓ Free Actions

Free to watch • No registration required • HD streaming

#ergm

Trending Tags

Last Seen Tags

#ergm

Trending Tags

Last Seen Tags

#ergm