Title: | Temporal Exponential Random Graph Models by Bootstrapped Pseudolikelihood |
---|---|
Description: | Temporal Exponential Random Graph Models (TERGM) estimated by maximum pseudolikelihood with bootstrapped confidence intervals or Markov Chain Monte Carlo maximum likelihood. Goodness of fit assessment for ERGMs, TERGMs, and SAOMs. Micro-level interpretation of ERGMs and TERGMs. The methods are described in Leifeld, Cranmer and Desmarais (2018), JStatSoft <doi:10.18637/jss.v083.i06>. |
Authors: | Philip Leifeld [aut, cre], Skyler J. Cranmer [ctb], Bruce A. Desmarais [ctb] |
Maintainer: | Philip Leifeld <[email protected]> |
License: | GPL (>= 2) |
Version: | 1.10.12 |
Built: | 2024-10-27 04:50:17 UTC |
Source: | https://github.com/leifeld/btergm |
Temporal Exponential Random Graph Models by Bootstrapped Pseudolikelihood.
Temporal Exponential Random Graph Models (TERGM) estimated by maximum pseudolikelihood with bootstrapped confidence intervals, Markov Chain Monte Carlo maximum likelihood, or Bayesian estimation. Goodness of fit assessment for ERGMs, TERGMs, and SAOMs. Micro-level interpretation of ERGMs and TERGMs.
The btergm package implements TERGMs with MPLE and bootstrapped
confidence intervals (btergm
function), MCMC MLE
(mtergm
function), or Bayesian estimation (tbergm
function). Goodness of fit assessment for ERGMs, TERGMs, SAOMs, and dyadic
independence models is possible with the generic gof
function
and its various methods defined here in the btergm package. New
networks can be simulated from TERGMs using the simulate.btergm
function. The package also implements micro-level interpretation for ERGMs
and TERGMs using the interpret
function. Furthermore, the
package contains the chemnet
and knecht
(T)ERGM
datasets. To display citation information, type citation("btergm")
.
Philip Leifeld, Skyler J. Cranmer, Bruce A. Desmarais
Cranmer, Skyler J., Tobias Heinrich and Bruce A. Desmarais (2014): Reciprocity and the Structural Determinants of the International Sanctions Network. Social Networks 36(1): 5-22. doi:10.1016/j.socnet.2013.01.001.
Desmarais, Bruce A. and Skyler J. Cranmer (2012): Statistical Mechanics of Networks: Estimation and Uncertainty. Physica A 391: 1865–1876. doi:10.1016/j.physa.2011.10.018.
Desmarais, Bruce A. and Skyler J. Cranmer (2010): Consistent Confidence Intervals for Maximum Pseudolikelihood Estimators. Neural Information Processing Systems 2010 Workshop on Computational Social Science and the Wisdom of Crowds.
Leifeld, Philip, Skyler J. Cranmer and Bruce A. Desmarais (2018): Temporal Exponential Random Graph Models with btergm: Estimation and Bootstrap Confidence Intervals. Journal of Statistical Software 83(6): 1–36. doi:10.18637/jss.v083.i06.
Useful links:
Adjust the dimensions of a source object to the dimensions of a target object.
adjust( source, target, remove = TRUE, add = TRUE, value = NA, returnlabels = FALSE )
adjust( source, target, remove = TRUE, add = TRUE, value = NA, returnlabels = FALSE )
source |
A matrix, network, list or data.frame object or a vector which should be adjusted. |
target |
A matrix, network, list or data.frame object or a vector to which the source object is compared with regard to its labels. |
remove |
Should rows and columns that are not present in the target object be removed? |
add |
Should rows and columns that are present in the target object but not in the source object be added to the source object? |
value |
The value to be inserted if a new row or column is added. By
default, new cells are filled with |
returnlabels |
Return a list of added and removed row and column labels rather than the actual matrix, vector, or network object? |
An adjacency matrix (the source
matrix) is compared to another
adjacency matrix (the target
matrix) by matching the row or column
labels. If the target matrix contains rows/columns which are not present in
the source matrix, new rows and columns with the corresponding labels and
NA
values in the cells are inserted into the source matrix. If the
source matrix contains rows/columns which are not present in the target
matrix, these rows and columns are removed from the source matrix. In
addition to adjacency matrices, two-mode matrices, network objects (also with
vertex attributes), and vectors are supported.
Note that it is not necessary to use this function to preprocess any data before estimating a TERGM. The estimation functions in the btergm package call this function repeatedly to mutually adjust all data as needed.
# create sociomatrix a with 13 vertices a to m vertices <- letters[1:13] a <- matrix(rbinom(length(vertices)^2, 1, 0.1), nrow = length(vertices)) rownames(a) <- colnames(a) <- vertices # create matrix b with the same vertices except f and k, but additional n vertices <- c(vertices[-c(6, 11)], "n") b <- matrix(rbinom(length(vertices)^2, 1, 0.1), nrow = length(vertices)) rownames(b) <- colnames(b) <- vertices # check dimensions dim(a) # 13 x 13 dim(b) # 12 x 12 # adjust a to b: add n and fill up with NAs; remove f and k adjust(a, b, add = TRUE, remove = TRUE) ## Not run: # more complex example with additional attributes stored in the network # object; convert a to network object with additional vertex and network # attributes nw <- network(a) vertices <- letters[1:13] nwattrib1 <- matrix(rbinom(length(vertices)^2, 1, 0.1), nrow = length(vertices)) nwattrib2 <- nwattrib1 rownames(nwattrib1) <- colnames(nwattrib1) <- vertices set.network.attribute(nw, "nwattrib1", nwattrib1) set.network.attribute(nw, "nwattrib2", nwattrib2) set.vertex.attribute(nw, "vattrib", 1:length(vertices)) # check presence of the two attributes list.network.attributes(nw) # nwattrib1 and nwattrib2 are listed get.network.attribute(nw, "nwattrib1") # returns sociomatrix with labels get.network.attribute(nw, "nwattrib2") # returns sociomatrix without labels list.vertex.attributes(nw) # vattrib is listed get.vertex.attribute(nw, "vattrib") # returns numeric vector 1:13 # adjust the network including the two attributes nw.adjusted <- adjust(nw, b, add = TRUE, remove = TRUE) as.matrix(nw.adjusted) # note that the order of nodes may have changed get.network.attribute(nw.adjusted, "nwattrib1") # returns adjusted matrix get.network.attribute(nw.adjusted, "nwattrib2") # returns adjusted matrix get.vertex.attribute(nw.adjusted, "vattrib") # returns adjusted vector ## End(Not run)
# create sociomatrix a with 13 vertices a to m vertices <- letters[1:13] a <- matrix(rbinom(length(vertices)^2, 1, 0.1), nrow = length(vertices)) rownames(a) <- colnames(a) <- vertices # create matrix b with the same vertices except f and k, but additional n vertices <- c(vertices[-c(6, 11)], "n") b <- matrix(rbinom(length(vertices)^2, 1, 0.1), nrow = length(vertices)) rownames(b) <- colnames(b) <- vertices # check dimensions dim(a) # 13 x 13 dim(b) # 12 x 12 # adjust a to b: add n and fill up with NAs; remove f and k adjust(a, b, add = TRUE, remove = TRUE) ## Not run: # more complex example with additional attributes stored in the network # object; convert a to network object with additional vertex and network # attributes nw <- network(a) vertices <- letters[1:13] nwattrib1 <- matrix(rbinom(length(vertices)^2, 1, 0.1), nrow = length(vertices)) nwattrib2 <- nwattrib1 rownames(nwattrib1) <- colnames(nwattrib1) <- vertices set.network.attribute(nw, "nwattrib1", nwattrib1) set.network.attribute(nw, "nwattrib2", nwattrib2) set.vertex.attribute(nw, "vattrib", 1:length(vertices)) # check presence of the two attributes list.network.attributes(nw) # nwattrib1 and nwattrib2 are listed get.network.attribute(nw, "nwattrib1") # returns sociomatrix with labels get.network.attribute(nw, "nwattrib2") # returns sociomatrix without labels list.vertex.attributes(nw) # vattrib is listed get.vertex.attribute(nw, "vattrib") # returns numeric vector 1:13 # adjust the network including the two attributes nw.adjusted <- adjust(nw, b, add = TRUE, remove = TRUE) as.matrix(nw.adjusted) # note that the order of nodes may have changed get.network.attribute(nw.adjusted, "nwattrib1") # returns adjusted matrix get.network.attribute(nw.adjusted, "nwattrib2") # returns adjusted matrix get.vertex.attribute(nw.adjusted, "vattrib") # returns adjusted vector ## End(Not run)
Longitudinal international defense alliance network, 1981–2000.
allyNet
is a list of network objects at 20 time points,
1981–2000, containing undirected defense alliance networks. In addition to
the alliance ties, each network object contains three vertex attributes.
cinc
is the "CINC" or Composite Index of National Capability score
(see
https://correlatesofwar.org/data-sets/national-material-capabilities/).
polity
is the "polity score" of each country in the respective year.
Quoting the online description, "the Polity Score captures this regime
authority spectrum on a 21-point scale ranging from -10 (hereditary
monarchy) to +10 (consolidated democracy)," (see
https://www.systemicpeace.org/polityproject.html). year
is
simply the year recorded as a vertex attribute.
contigMat
is a 164 x 164 binary matrix in which a 1 indicates that two countries share a border.
lNet
is a list of 20 matrices. Each element is the adjacency matrix from the previous year. This is used to model memory in the ties.
LSP
is a list of 20 matrices. Each element is a matrix recording the number of shared partners between countries in the alliance network from the previous year.
warNet
is a list of 20 matrices. Each element is a binary matrix that indicates whether two states were in a militarized interstate dispute in the respective year.
The alliances dataset contains the international defense alliance network among 164 countries, covering the years 1981–2000. In addition to the yearly defense alliance network, it contains data on military capabilities, governing regime type, geographic contiguity and international conflict. This is an excerpt from a dataset that has been used in two published analyses. The full dataset (Cranmer, Desmarais and Menninga 2012; Cranmer, Desmarais and Kirkland 2012) contains a large number of countries and a much longer time series.
The data were gathered by Skyler Cranmer and Bruce Desmarais in the process of writing Cranmer, Desmarais and Menninga (2012) and Cranmer, Desmarais and Kirkland (2012).
Permission to redistribute this dataset along with this package was granted by Skyler Cranmer and Bruce Desmarais on December 15, 2015. Questions about the data should be directed to them.
Cranmer, Skyler J., Bruce A. Desmarais, and Justin H. Kirkland (2012): Toward a Network Theory of Alliance Formation. International Interactions 38(3): 295–324. doi:10.1080/03050629.2012.677741.
Cranmer, Skyler J., Bruce A. Desmarais, and Elizabeth Menninga (2012): Complex Dependencies in the Alliance Network. International Interactions 29(3): 279–313. doi:10.1177/0738894212443446.
## Not run: data("alliances") # btergm formulas look very similar to ERGM formulas. # Note the R argument; usually want R > 1000. # Here it is set to 50 to limit computation time. # First, set the seed for replicability. set.seed(123) model <- btergm(allyNet ~ edges + gwesp(0, fixed = TRUE) + edgecov(lNet) + edgecov(LSP) + edgecov(warNet) + nodecov("polity") + nodecov("cinc") + absdiff("polity") + absdiff("cinc") + edgecov(contigMat) + nodecov("year"), R = 50) # View estimates and confidence intervals. summary(model) # Evaluate model fit. Simulate 100 networks for each time point. # Calculate edgewise shared partners, degree and geodesic distance # distance distributions. alliance_gof <- gof(model, statistics = c(deg, esp, geodesic)) # Plot goodness of fit. plot(alliance_gof) ## End(Not run)
## Not run: data("alliances") # btergm formulas look very similar to ERGM formulas. # Note the R argument; usually want R > 1000. # Here it is set to 50 to limit computation time. # First, set the seed for replicability. set.seed(123) model <- btergm(allyNet ~ edges + gwesp(0, fixed = TRUE) + edgecov(lNet) + edgecov(LSP) + edgecov(warNet) + nodecov("polity") + nodecov("cinc") + absdiff("polity") + absdiff("cinc") + edgecov(contigMat) + nodecov("year"), R = 50) # View estimates and confidence intervals. summary(model) # Evaluate model fit. Simulate 100 networks for each time point. # Calculate edgewise shared partners, degree and geodesic distance # distance distributions. alliance_gof <- gof(model, statistics = c(deg, esp, geodesic)) # Plot goodness of fit. plot(alliance_gof) ## End(Not run)
Estimate a TERGM by MPLE with temporal bootstrapping.
btergm( formula, R = 500, offset = FALSE, returndata = FALSE, parallel = c("no", "multicore", "snow"), ncpus = 1, cl = NULL, control.ergm = NULL, usefastglm = FALSE, verbose = TRUE, ... )
btergm( formula, R = 500, offset = FALSE, returndata = FALSE, parallel = c("no", "multicore", "snow"), ncpus = 1, cl = NULL, control.ergm = NULL, usefastglm = FALSE, verbose = TRUE, ... )
formula |
Formula for the TERGM. Model construction works like in the
ergm package with the same model terms etc. (for a list of terms, see
|
R |
Number of bootstrap replications. The higher the number of replications, the more accurate but also the slower is the estimation. |
offset |
If |
returndata |
Return the processed input data instead of estimating and
returning the model? In the |
parallel |
Use multiple cores in a computer or nodes in a cluster to
speed up bootstrapping computations. The default value |
ncpus |
The number of CPU cores used for parallel computing (only if
|
cl |
An optional parallel or snow cluster for use if
|
control.ergm |
ergm controls for |
usefastglm |
Controls whether to use the |
verbose |
Print details about data preprocessing and estimation settings. |
... |
Further arguments to be handed over to the
|
The btergm
function computes temporal exponential random graph models
(TERGM) by bootstrapped pseudolikelihood, as described in Desmarais and
Cranmer (2012). It is faster than MCMC-MLE but only asymptotically unbiased
the longer the time series of networks because it uses temporal bootstrapping
to correct the standard errors.
Philip Leifeld, Skyler J. Cranmer, Bruce A. Desmarais
Cranmer, Skyler J., Tobias Heinrich and Bruce A. Desmarais (2014): Reciprocity and the Structural Determinants of the International Sanctions Network. Social Networks 36(1): 5-22. doi:10.1016/j.socnet.2013.01.001.
Desmarais, Bruce A. and Skyler J. Cranmer (2012): Statistical Mechanics of Networks: Estimation and Uncertainty. Physica A 391: 1865–1876. doi:10.1016/j.physa.2011.10.018.
Desmarais, Bruce A. and Skyler J. Cranmer (2010): Consistent Confidence Intervals for Maximum Pseudolikelihood Estimators. Neural Information Processing Systems 2010 Workshop on Computational Social Science and the Wisdom of Crowds.
Leifeld, Philip, Skyler J. Cranmer and Bruce A. Desmarais (2017): Temporal Exponential Random Graph Models with btergm: Estimation and Bootstrap Confidence Intervals. Journal of Statistical Software 83(6): 1-36. doi:10.18637/jss.v083.i06.
set.seed(5) networks <- list() for (i in 1:10) { # create 10 random networks with 10 actors mat <- matrix(rbinom(100, 1, .25), nrow = 10, ncol = 10) diag(mat) <- 0 # loops are excluded nw <- network::network(mat) # create network object networks[[i]] <- nw # add network to the list } covariates <- list() for (i in 1:10) { # create 10 matrices as covariate mat <- matrix(rnorm(100), nrow = 10, ncol = 10) covariates[[i]] <- mat # add matrix to the list } fit <- btergm(networks ~ edges + istar(2) + edgecov(covariates), R = 100) summary(fit) # show estimation results # For examples with real data, see help("knecht") or help("alliances"). # Examples for parallel processing: # Some preliminaries: # - "Forking" means running the code on multiple cores in the same # computer. It's fast but consumes a lot of memory because all # objects are copied for each node. It's also restricted to # cores within a physical computer, i.e. no distribution over a # network or cluster. Forking does not work on Windows systems. # - "MPI" is a protocol for distributing computations over many # cores, often across multiple physical computers/nodes. MPI # is fast and can distribute the work across hundreds of nodes # (but remember that R can handle a maximum of 128 connections, # which includes file access and parallel connections). However, # it requires that the Rmpi package is installed and that an MPI # server is running (e.g., OpenMPI). # - "PSOCK" is a TCP-based protocol. It can also distribute the # work to many cores across nodes (like MPI). The advantage of # PSOCK is that it can as well make use of multiple nodes within # the same node or desktop computer (as with forking) but without # consuming too much additional memory. However, the drawback is # that it is not as fast as MPI or forking. # The following code provides examples for these three scenarios. # btergm works with clusters via the parallel package. That is, the # user can create a cluster object (of type "PSOCK", "MPI", or # "FORK") and supply it to the 'cl' argument of the 'btergm' # function. If no cluster object is provided, btergm will try to # create a temporary PSOCK cluster (if parallel = "snow") or it # will use forking (if parallel = "multicore"). ## Not run: # To use a PSOCK cluster without providing an explicit cluster # object: require("parallel") fit <- btergm(networks ~ edges + istar(2) + edgecov(covariates), R = 100, parallel = "snow", ncpus = 25) # Equivalently, a PSOCK cluster can be provided as follows: require("parallel") cores <- 25 cl <- makeCluster(cores, type = "PSOCK") fit <- btergm(networks ~ edges + istar(2) + edgecov(covariates), R = 100, parallel = "snow", ncpus = cores, cl = cl) stopCluster(cl) # Forking (without supplying a cluster object) can be used as # follows. require("parallel") cores <- 25 fit <- btergm(networks ~ edges + istar(2) + edgecov(covariates), R = 100, parallel = "multicore", ncpus = cores) stopCluster(cl) # Forking (by providing a cluster object) works as follows: require("parallel") cores <- 25 cl <- makeCluster(cores, type = "FORK") fit <- btergm(networks ~ edges + istar(2) + edgecov(covariates), R = 100, parallel = "snow", ncpus = cores, cl = cl) stopCluster(cl) # To use MPI, a cluster object MUST be created beforehand. In # this example, a MOAB HPC server is used. It stores the number of # available cores as a system option: require("parallel") cores <- as.numeric(Sys.getenv("MOAB_PROCCOUNT")) cl <- makeCluster(cores, type = "MPI") fit <- btergm(networks ~ edges + istar(2) + edgecov(covariates), R = 100, parallel = "snow", ncpus = cores, cl = cl) stopCluster(cl) # In the following example, the Rmpi package is used to create a # cluster. This may not work on all systems; consult your local # support staff or the help files on your HPC server to find out how # to create a cluster object on your system. # snow/Rmpi start-up if (!is.loaded("mpi_initialize")) { library("Rmpi") } library(snow); mpirank <- mpi.comm.rank (0) if (mpirank == 0) { invisible(makeMPIcluster()) } else { sink (file="/dev/null") invisible(slaveLoop (makeMPImaster())) mpi.finalize() q() } # End snow/Rmpi start-up cl <- getMPIcluster() fit <- btergm(networks ~ edges + istar(2) + edgecov(covariates), R = 100, parallel = "snow", ncpus = 25, cl = cl) ## End(Not run)
set.seed(5) networks <- list() for (i in 1:10) { # create 10 random networks with 10 actors mat <- matrix(rbinom(100, 1, .25), nrow = 10, ncol = 10) diag(mat) <- 0 # loops are excluded nw <- network::network(mat) # create network object networks[[i]] <- nw # add network to the list } covariates <- list() for (i in 1:10) { # create 10 matrices as covariate mat <- matrix(rnorm(100), nrow = 10, ncol = 10) covariates[[i]] <- mat # add matrix to the list } fit <- btergm(networks ~ edges + istar(2) + edgecov(covariates), R = 100) summary(fit) # show estimation results # For examples with real data, see help("knecht") or help("alliances"). # Examples for parallel processing: # Some preliminaries: # - "Forking" means running the code on multiple cores in the same # computer. It's fast but consumes a lot of memory because all # objects are copied for each node. It's also restricted to # cores within a physical computer, i.e. no distribution over a # network or cluster. Forking does not work on Windows systems. # - "MPI" is a protocol for distributing computations over many # cores, often across multiple physical computers/nodes. MPI # is fast and can distribute the work across hundreds of nodes # (but remember that R can handle a maximum of 128 connections, # which includes file access and parallel connections). However, # it requires that the Rmpi package is installed and that an MPI # server is running (e.g., OpenMPI). # - "PSOCK" is a TCP-based protocol. It can also distribute the # work to many cores across nodes (like MPI). The advantage of # PSOCK is that it can as well make use of multiple nodes within # the same node or desktop computer (as with forking) but without # consuming too much additional memory. However, the drawback is # that it is not as fast as MPI or forking. # The following code provides examples for these three scenarios. # btergm works with clusters via the parallel package. That is, the # user can create a cluster object (of type "PSOCK", "MPI", or # "FORK") and supply it to the 'cl' argument of the 'btergm' # function. If no cluster object is provided, btergm will try to # create a temporary PSOCK cluster (if parallel = "snow") or it # will use forking (if parallel = "multicore"). ## Not run: # To use a PSOCK cluster without providing an explicit cluster # object: require("parallel") fit <- btergm(networks ~ edges + istar(2) + edgecov(covariates), R = 100, parallel = "snow", ncpus = 25) # Equivalently, a PSOCK cluster can be provided as follows: require("parallel") cores <- 25 cl <- makeCluster(cores, type = "PSOCK") fit <- btergm(networks ~ edges + istar(2) + edgecov(covariates), R = 100, parallel = "snow", ncpus = cores, cl = cl) stopCluster(cl) # Forking (without supplying a cluster object) can be used as # follows. require("parallel") cores <- 25 fit <- btergm(networks ~ edges + istar(2) + edgecov(covariates), R = 100, parallel = "multicore", ncpus = cores) stopCluster(cl) # Forking (by providing a cluster object) works as follows: require("parallel") cores <- 25 cl <- makeCluster(cores, type = "FORK") fit <- btergm(networks ~ edges + istar(2) + edgecov(covariates), R = 100, parallel = "snow", ncpus = cores, cl = cl) stopCluster(cl) # To use MPI, a cluster object MUST be created beforehand. In # this example, a MOAB HPC server is used. It stores the number of # available cores as a system option: require("parallel") cores <- as.numeric(Sys.getenv("MOAB_PROCCOUNT")) cl <- makeCluster(cores, type = "MPI") fit <- btergm(networks ~ edges + istar(2) + edgecov(covariates), R = 100, parallel = "snow", ncpus = cores, cl = cl) stopCluster(cl) # In the following example, the Rmpi package is used to create a # cluster. This may not work on all systems; consult your local # support staff or the help files on your HPC server to find out how # to create a cluster object on your system. # snow/Rmpi start-up if (!is.loaded("mpi_initialize")) { library("Rmpi") } library(snow); mpirank <- mpi.comm.rank (0) if (mpirank == 0) { invisible(makeMPIcluster()) } else { sink (file="/dev/null") invisible(slaveLoop (makeMPImaster())) mpi.finalize() q() } # End snow/Rmpi start-up cl <- getMPIcluster() fit <- btergm(networks ~ edges + istar(2) + edgecov(covariates), R = 100, parallel = "snow", ncpus = 25, cl = cl) ## End(Not run)
An S4 class to represent a fitted TERGM by bootstrapped MPLE.
Show the coefficients of a btergm
object.
## S4 method for signature 'btergm' show(object) ## S4 method for signature 'btergm' coef(object, invlogit = FALSE, ...) ## S4 method for signature 'btergm' nobs(object) btergm.se(object, print = FALSE) ## S4 method for signature 'btergm' confint(object, parm, level = 0.95, type = "perc", invlogit = FALSE, ...) timesteps.btergm(object) ## S4 method for signature 'btergm' summary(object, level = 0.95, type = "perc", invlogit = FALSE, ...)
## S4 method for signature 'btergm' show(object) ## S4 method for signature 'btergm' coef(object, invlogit = FALSE, ...) ## S4 method for signature 'btergm' nobs(object) btergm.se(object, print = FALSE) ## S4 method for signature 'btergm' confint(object, parm, level = 0.95, type = "perc", invlogit = FALSE, ...) timesteps.btergm(object) ## S4 method for signature 'btergm' summary(object, level = 0.95, type = "perc", invlogit = FALSE, ...)
object |
A |
invlogit |
Apply inverse logit transformation to the estimates and/or
confidence intervals? That is, |
... |
Further arguments to be passed through to the |
print |
Should the formatted coefficient table be printed to the R
console along with significance stars ( |
parm |
Parameters (specified by integer position or character string). |
level |
The significance level for computation of the confidence
intervals. The default is |
type |
Type of confidence interval, e.g., basic bootstrap interval
( |
btergm
objects result from the estimation of a bootstrapped TERGM via
the btergm
function. btergm
objects contain the
coefficients, the bootstrapping samples of the coefficients, the number of
replications, the number of observations, the number of time steps, the
original formula, and the response, effects and weights objects that were fed
into the glm
call for estimating the model.
coef(btergm)
: Return the coefficients of a btergm
object.
nobs(btergm)
: Return the number of observations saved in a
btergm
object.
btergm.se()
: Create a coefficient table from a btergm
object
Create a coefficient matrix with standard errors and p-values.
This function can create a coefficient matrix with coefficients, standard
errors, z-scores, and p-values, based on a fitted btergm
object.
If the argument print = TRUE
is used, the matrix is printed to the R
console as a formatted coefficient matrix with significance stars instead.
Note that confidence intervals are the preferred way of interpretation for
bootstrapped TERGMs; standard errors are only accurate if the bootstrapped
data are normally distributed, which is not always the case. Various methods
for checking for normality for each model term are available, for example
quantile-quantile plots (e.g., qqnorm(x@boot$t[, 1])
for the first
model term in the btergm
object called x
).
confint(btergm)
: Return the confidence intervals for estimates in a
btergm
object.
timesteps.btergm()
: Return the number of time steps saved in a
btergm
object.
summary(btergm)
: Summary of a fitted btergm
object.
coef
Object of class "numeric"
. The coefficients.
boot
Object of class "matrix"
. The bootstrapping sample.
R
Object of class "numeric"
. Number of replications.
nobs
Object of class "numeric"
. Number of observations.
time.steps
Object of class "numeric"
. Number of time steps.
formula
Object of class "formula"
. The original model formula
(without indices for the time steps).
formula2
The revised formula with the object references after applying
the tergmprepare
function.
response
Object of class "integer"
. The response variable.
effects
Object of class "data.frame"
. The effects that went
into the glm
call.
weights
Object of class "integer"
. The weights of the
observations.
auto.adjust
Object of class "logical"
. Indicates whether
automatic adjustment of dimensions was done before estimation.
offset
Object of class "logical"
. Indicates whether an offset
matrix with structural zeros was used.
directed
Object of class "logical"
. Are the dependent networks
directed?
bipartite
Object of class "logical"
. Are the dependent networks
bipartite?
nvertices
Number of vertices.
data
The data after processing by the tergmprepare
function.
Other tergm-classes:
createBtergm()
,
createMtergm()
,
createTbergm()
,
mtergm-class
,
tbergm-class
Check for degeneracy in fitted TERGMs.
checkdegeneracy(object, ...) ## S4 method for signature 'mtergm' checkdegeneracy(object, ...) ## S4 method for signature 'btergm' checkdegeneracy( object, nsim = 1000, MCMC.interval = 1000, MCMC.burnin = 10000, verbose = FALSE ) ## S3 method for class 'degeneracy' print( x, center = FALSE, t = 1:length(x$sim), terms = 1:length(x$target.stats[[1]]), ... ) ## S3 method for class 'degeneracy' plot( x, center = TRUE, t = 1:length(x$sim), terms = 1:length(x$target.stats[[1]]), vbar = TRUE, main = NULL, xlab = NULL, target.col = "red", target.lwd = 3, ... )
checkdegeneracy(object, ...) ## S4 method for signature 'mtergm' checkdegeneracy(object, ...) ## S4 method for signature 'btergm' checkdegeneracy( object, nsim = 1000, MCMC.interval = 1000, MCMC.burnin = 10000, verbose = FALSE ) ## S3 method for class 'degeneracy' print( x, center = FALSE, t = 1:length(x$sim), terms = 1:length(x$target.stats[[1]]), ... ) ## S3 method for class 'degeneracy' plot( x, center = TRUE, t = 1:length(x$sim), terms = 1:length(x$target.stats[[1]]), vbar = TRUE, main = NULL, xlab = NULL, target.col = "red", target.lwd = 3, ... )
object |
A |
... |
Arbitrary further arguments for subroutines. |
nsim |
The number of networks to be simulated at each time step. This number should be sufficiently large for a meaningful comparison. If possible, much more than 1,000 simulations. |
MCMC.interval |
Internally, this package uses the simulation facilities
of the ergm package to create new networks against which to compare
the original network(s) for goodness-of-fit assessment. This argument sets
the MCMC interval to be passed over to the simulation command. The default
value is |
MCMC.burnin |
Internally, this package uses the simulation facilities of
the ergm package to create new networks against which to compare the
original network(s) for goodness-of-fit assessment. This argument sets the
MCMC burnin to be passed over to the simulation command. The default value
is |
verbose |
Print details? |
x |
A |
center |
If |
t |
Time indices to include, e.g., |
terms |
Indices of the model terms to include, e.g., |
vbar |
Show vertical bar for target statistic in histogram. |
main |
Main title of the plot. |
xlab |
Label on the x-axis. Defaults to the name of the statistic. |
target.col |
Color of the vertical bar for the target statistic. Defaults to red. |
target.lwd |
Line width of the vertical bar for the target statistic. Defaults to 3. |
The methods for the generic degeneracy
function implement a degeneracy
check for btergm
and mtergm
objects. For btergm
, this
works by comparing the global statistics of simulated networks to those of
the observed networks at each observed time step. If the global statistics
differ significantly, this is indicated by small p-values. If there are many
significant results, this indicates degeneracy. For mtergm
, the
mcmc.diagnostics
function from the ergm package is used.
A list with target statistics and simulations.
Hanneke, Steve, Wenjie Fu and Eric P. Xing (2010): Discrete Temporal Models of Social Networks. Electronic Journal of Statistics 4: 585–605. doi:10.1214/09-EJS548.
Leifeld, Philip, Skyler J. Cranmer and Bruce A. Desmarais (2018): Temporal Exponential Random Graph Models with btergm: Estimation and Bootstrap Confidence Intervals. Journal of Statistical Software 83(6): 1-36. doi:10.18637/jss.v083.i06.
German Toxic Chemicals Policy Network in the 1980s (Volker Schneider).
pol
is a directed 30 x 30 adjancency matrix indicating which
row actor sends political/strategic information to which column actor.
1
indicates an information exchange tie, and 0
indicates the
absence of a network tie.
scito
is a directed 30 x 30 adjacency matrix indicating which
row actor sends technical/scientific information to which column actor.
1
indicates an information exchange tie, and 0
indicates the
absence of a network tie. In contrast to political/strategic information
exchange, two separate survey questions were asked about
technical/scientific information exchange: sending information, and
receiving information. The two matrices contain the same relation but one
time from the sender's perspective and one time from the receiver's
perspective. By combining the two matrices, one can create a "confirmed"
technical/scientific information exchange relation. The scito
matrix
contains ties from the sender's perspective.
scifrom
is a directed 30 x 30 adjacency matrix indicating
which row actor receives technical/scientific information from which column
actor. 1
indicates an information exchange tie, and 0
indicates the absence of a network tie. In contrast to political/strategic
information exchange, two separate survey questions were asked about
technical/scientific information exchange: sending information, and
receiving information. The two matrices contain the same relation but one
time from the sender's perspective and one time from the receiver's
perspective. By combining the two matrices, one can create a "confirmed"
technical/scientific information exchange relation. The scifrom
matrix contains ties from the receiver's perspective.
infrep
is a directed 30 x 30 adjancency matrix indicating
which row actor deems which column actor "particularly influential".
1
indicates such a tie, and 0
indicates the absence of an
influence attribution tie.
committee
is a 30 x 20 two-mode (bipartite) network matrix
indicating which row actor is a member of which policy committee/forum (as
indicated by the column labels). 1
indicates a membership tie, and
0
indicates non-membership.
types
is a one-column data.frame where the type
variable contains the actor type of each node. The following values are
possible:
gov
(government actor, e.g., a federal ministry)
ig
(interest group)
io
(international organization)
par
(political party)
sci
(scientific organization)
intpos
is a 30 x 6 matrix containing the interest positions of
the 30 political actors on the six most salient political issues related to a
pending new chemicals law. -1
indicates a negative stance, i.e., the
actor rejects the proposal; 1
indicates a positive stance, i.e., the
actor supports the proposal; and 0
indicates a neutral or absent
opinion.
The chemnet dataset contains network and attribute data and for the 30 most influential political actors with regard to toxic chemicals regulation in Germany in 1983/1984. While the original dataset contains up to 47 actors, this dataset contains the "complete influence core" of mutually relevant actors. The data are cross-sectional. There are no missing data; the response rate was 100 percent. Volker Schneider (University of Konstanz) collected this dataset for his dissertation (Schneider 1988). The dataset was later re-used for a journal publication on information exchange in policy networks (Leifeld and Schneider 2012).
The chemnet dataset contains network relations on political/strategic and technical/scientific information exchange, influence attribution, and membership in policy committees/forums, as well as nodal attributes on the actor type and opinions about the six most salient issues related to the political process that was leading to a new chemicals law at the time being.
The data were collected using paper-based questionnaires. The questionnaires were administered in personal interviews (PAPI). Further information, including the actual survey, data on additional actors, the full names of the policy committees/forums, and the full list of unabbreviated actor names can be found online at http://hdl.handle.net/1902.1/17004 in the replication archive of Leifeld and Schneider (2012).
Replication archive: http://hdl.handle.net/1902.1/17004
AJPS publication: doi:10.1111/j.1540-5907.2011.00580.x
The dataset is publicly available. Questions about the data or the original study should be directed to Volker Schneider <[email protected]>, the author of the original study and person who collected the data.
Leifeld, Philip and Volker Schneider (2012): Information Exchange in Policy Networks. American Journal of Political Science 53(3): 731–744. doi:10.1111/j.1540-5907.2011.00580.x.
Schneider, Volker (1988): Politiknetzwerke der Chemikalienkontrolle. Eine Analyse einer transnationalen Politikentwicklung. Walter de Gruyter: Berlin/New York.
Schneider, Volker and Philip Leifeld (2009): Ueberzeugungssysteme, Diskursnetzwerke und politische Kommunikation: Ein zweiter Blick auf die deutsche Chemikalienkontrolle der 1980er Jahre. In: Volker Schneider, Frank Janning, Philip Leifeld and Thomas Malang (editors): Politiknetzwerke. Modelle, Anwendungen und Visualisierungen. Pages 139–158. Wiesbaden: VS Verlag fuer Sozialwissenschaften.
## Not run: # Replication code for Leifeld and Schneider (2012), AJPS. # Note that the estimates can only be reproduced approximately # due to internal changes in the statnet package. # preparatory steps library("network") library("sna") library("ergm") library("btergm") library("texreg") seed <- 12345 set.seed(seed) data("chemnet") # create confirmed network relation sci <- scito * t(scifrom) # equation 1 in the AJPS paper prefsim <- dist(intpos, method = "euclidean") # equation 2 prefsim <- max(prefsim) - prefsim # equation 3 prefsim <- as.matrix(prefsim) committee <- committee %*% t(committee) # equation 4 diag(committee) <- 0 # the diagonal has no meaning types <- types[, 1] # convert to vector # create network objects and store attributes nw.pol <- network(pol) # political/stratgic information exchange set.vertex.attribute(nw.pol, "orgtype", types) set.vertex.attribute(nw.pol, "betweenness", betweenness(nw.pol)) # centrality nw.sci <- network(sci) # technical/scientific information exchange set.vertex.attribute(nw.sci, "orgtype", types) set.vertex.attribute(nw.sci, "betweenness", betweenness(nw.sci)) # centrality # ERGM: model 1 in the AJPS paper; only preference similarity model1 <- ergm(nw.pol ~ edges + edgecov(prefsim), control = control.ergm(seed = seed)) summary(model1) # ERGM: model 2 in the AJPS paper; complete model model2 <- ergm(nw.pol ~ edges + edgecov(prefsim) + mutual + nodemix("orgtype", base = -7) + nodeifactor("orgtype", base = -1) + nodeofactor("orgtype", base = -5) + edgecov(committee) + edgecov(nw.sci) + edgecov(infrep) + gwesp(0.1, fixed = TRUE) + gwdsp(0.1, fixed = TRUE), control = control.ergm(seed = seed) ) summary(model2) # ERGM: model 3 in the AJPS paper; only preference similarity model3 <- ergm(nw.sci ~ edges + edgecov(prefsim), control = control.ergm(seed = seed)) summary(model3) # ERGM: model 4 in the AJPS paper; complete model model4 <- ergm(nw.sci ~ edges + edgecov(prefsim) + mutual + nodemix("orgtype", base = -7) + nodeifactor("orgtype", base = -1) + nodeofactor("orgtype", base = -5) + edgecov(committee) + edgecov(nw.pol) + edgecov(infrep) + gwesp(0.1, fixed = TRUE) + gwdsp(0.1, fixed = TRUE), control = control.ergm(seed = seed) ) summary(model4) # regression table using the texreg package screenreg(list(model1, model2, model3, model4)) # goodness of fit using the btergm package gof2 <- gof(model2, roc = FALSE, pr = FALSE) gof2 # print gof output plot(gof2) # visual inspection of GOF gof4 <- gof(model4, roc = FALSE, pr = FALSE) gof4 plot(gof4) # MCMC diagnostics pdf("diagnostics2.pdf") mcmc.diagnostics(model2) dev.off() pdf("diagnostics4.pdf") mcmc.diagnostics(model4) dev.off() ## End(Not run)
## Not run: # Replication code for Leifeld and Schneider (2012), AJPS. # Note that the estimates can only be reproduced approximately # due to internal changes in the statnet package. # preparatory steps library("network") library("sna") library("ergm") library("btergm") library("texreg") seed <- 12345 set.seed(seed) data("chemnet") # create confirmed network relation sci <- scito * t(scifrom) # equation 1 in the AJPS paper prefsim <- dist(intpos, method = "euclidean") # equation 2 prefsim <- max(prefsim) - prefsim # equation 3 prefsim <- as.matrix(prefsim) committee <- committee %*% t(committee) # equation 4 diag(committee) <- 0 # the diagonal has no meaning types <- types[, 1] # convert to vector # create network objects and store attributes nw.pol <- network(pol) # political/stratgic information exchange set.vertex.attribute(nw.pol, "orgtype", types) set.vertex.attribute(nw.pol, "betweenness", betweenness(nw.pol)) # centrality nw.sci <- network(sci) # technical/scientific information exchange set.vertex.attribute(nw.sci, "orgtype", types) set.vertex.attribute(nw.sci, "betweenness", betweenness(nw.sci)) # centrality # ERGM: model 1 in the AJPS paper; only preference similarity model1 <- ergm(nw.pol ~ edges + edgecov(prefsim), control = control.ergm(seed = seed)) summary(model1) # ERGM: model 2 in the AJPS paper; complete model model2 <- ergm(nw.pol ~ edges + edgecov(prefsim) + mutual + nodemix("orgtype", base = -7) + nodeifactor("orgtype", base = -1) + nodeofactor("orgtype", base = -5) + edgecov(committee) + edgecov(nw.sci) + edgecov(infrep) + gwesp(0.1, fixed = TRUE) + gwdsp(0.1, fixed = TRUE), control = control.ergm(seed = seed) ) summary(model2) # ERGM: model 3 in the AJPS paper; only preference similarity model3 <- ergm(nw.sci ~ edges + edgecov(prefsim), control = control.ergm(seed = seed)) summary(model3) # ERGM: model 4 in the AJPS paper; complete model model4 <- ergm(nw.sci ~ edges + edgecov(prefsim) + mutual + nodemix("orgtype", base = -7) + nodeifactor("orgtype", base = -1) + nodeofactor("orgtype", base = -5) + edgecov(committee) + edgecov(nw.pol) + edgecov(infrep) + gwesp(0.1, fixed = TRUE) + gwdsp(0.1, fixed = TRUE), control = control.ergm(seed = seed) ) summary(model4) # regression table using the texreg package screenreg(list(model1, model2, model3, model4)) # goodness of fit using the btergm package gof2 <- gof(model2, roc = FALSE, pr = FALSE) gof2 # print gof output plot(gof2) # visual inspection of GOF gof4 <- gof(model4, roc = FALSE, pr = FALSE) gof4 plot(gof4) # MCMC diagnostics pdf("diagnostics2.pdf") mcmc.diagnostics(model2) dev.off() pdf("diagnostics4.pdf") mcmc.diagnostics(model4) dev.off() ## End(Not run)
Swiss political science co-authorship network 2013
ch_coaut
is an undirected, weighted 156 x 156 adjancency matrix indicating how many publications political scientist in Switzerland shared with each other as reported in late 2013, including only postdoctoral and professorial political scientists affiliated with research institutes or universities. The exact edge weight should be treated with caution because some publications were counted multiple times because they were reported by multiple co-authors. The diagonal contains the number of publications of the respective author. Leifeld and Ingold (2016) describe the data collection process in more detail.
ch_nodeattr
is a data frame with node attributes/variables for the 156 researchers, in the same alphabetical row order as the network matrix. The first twelve columns with column labels starting with "inst_" are affiliations with different institutions (1 = affiliated; 0 = no affiliation). The next seven columns with column labels starting with "city_" contain the locations of the researchers' institutional affiliations. The "phdyear" column contains the self-reported year of obtaining the PhD, and the "birthyear" column contains the self-reported or publicly available year of birth; these two variables contain many missing values. The "status" column indicates whether a researcher was listed as a professor or as having postdoctoral or other non-professorial status at the time. "chairtitle" is the name of the chair or research group the researcher reported to be a member of. "num_publications" is the total number of publications, "num_articles" the number of journal articles among them, "num_books" the number of books among them, "share_articles" the percentage of journal articles among the publications, and "share_books" the percentage of monographs and edited volumes among the publications. The four columns with column names starting with "lang_" contain the relative shares of English, French, German, Italian, and other languages among the publications of the researcher. The column "share_en_articles" contains the percentage of English journal articles among all publications of the researcher. "male" is a dummy variable indicating whether the author is male (1) or female (0). The variables contained here are described in Leifeld (2018).
ch_dist100km
is a 156 x 156 matrix containing the geographical distance between any two researchers measured in units of 100km (for a reasonable scaling of coefficients in a statistical model), computed over the latitude and longitude of their main institutional affiliations. The measure is included in Leifeld (2018).
ch_en_article_sim
is a 156 x 156 matrix containing the similarity between any two researchers in terms of the share of their work that is published in English and as journal articles. Values closer to 1.0 indicate that two researchers were similar in their language and publication type portfolio while values closer to 0 indicate that they were relatively dissimilar. Only extra-dyadic publications were counted in establishing this similarity. I.e., if researcher A and B co-authored, their joint publications were not included in establishing their English article share similarity. This was done to reduce endogeneity/reverse causality when modeling co-authorship as a function of English article share similarity. The measure is described in Leifeld (2018).
ch_topicsim
is a 156 x 156 topic similarity matrix for the researchers. Topic similarities were computed by taking into account all words in the publication titles of any two researchers, excluding the publications they published as co-authors (i.e., only extra-dyadic publications, to reduce endogeneity/reverse causality in modeling co-authorship ties as a function of topic similarity). Topic similarity was established by computing the cosine similarity between the tf-idf scores for the title words of any two researchers (i.e., a vector space model). Leifeld (2018) contains more details on this procedure.
The Swiss political science co-authorship network 2013 dataset contains the co-authorship network of all political scientists at Swiss universities and research institutes in late 2013. The data are described in Leifeld and Ingold (2016) and Leifeld (2018). The data contained here include postdoctoral and professorial researchers but not PhD students, as in Leifeld (2018), without the PhD researchers included in Leifeld and Ingold (2016). For the full dataset, see the replication archive at DOI doi:10.7910/DVN/85SK1M.
Leifeld and Ingold (2016) summarize the data collection strategy as follows: "Data gathering took place between July and December 2013. A single coder pursued a three-step coding procedure: he first created a list of all relevant university departments and research institutes that host political scientists in Switzerland, then he browsed the websites of these institutes and entered all researchers along with several details about them into a database, including their seniority status (predoctoral, postdoctoral, or professor) and the URL of their publication list (either the CV, the institutional website, a private homepage, or several of those items in order to get a complete publication profile of each person). After entering all researchers of an institute, the coder went through the researchers' publication lists and entered the following pieces of information for each publication into the database: the reporting author, the names of all co-authors, the title of the publication, the year, the name of the journal or book in which the publication appeared (if applicable), the names of all editors (if applicable), and a classification of the type of publication (academic journal, book chapter, monograph, edited volume, other). Most publications are relatively recent, but the earliest publications in the database date back to the 1960s. After completing these three steps, data entered at the beginning was double-checked in order to avoid bias due to new publications that may have shown up during the coding time period. This procedure is the best one can do in terms of completeness, but it should be clear that it crucially depends on the accuracy of the self-reported bibliographic information. For example, if a researcher did not update his or her CV or list of publications for the previous six months, those most recent publications only had a chance to enter the database if the co-authors listed the publication on their website. In some relatively rare cases, all authors failed to report recent updates, and this may cause minor inaccuracies in the network dataset, mostly affecting very recent publications in 2013 because there is, on average, a reporting lag."
Based on the collected publication data, a co-authorship network matrix with 156 nodes was created. In addition to this matrix, the dataset here contains node attribute data (institutional affiliations, location, demographics, language shares, publication type shares) and relational covariates (geographical distance, similarity in terms of the share of English articles, and topic similarity) as described in Leifeld (2018). The dataset can be used to replicate Leifeld (2018), but only approximately due to changes in the estimation routine in the ergm package since the article was published.
The data were collected from public information online. The full data collection details are described in Leifeld and Ingold (2016).
Leifeld, Philip (2018): Polarization in the Social Sciences: Assortative Mixing in Social Science Collaboration Networks is Resilient to Interventions. Physica A: Statistical Mechanics and its Applications 507: 510–523. doi:10.1016/j.physa.2018.05.109. Full replication data: doi:10.7910/DVN/85SK1M.
Leifeld, Philip and Karin Ingold (2016): Co-authorship Networks in Swiss Political Research. Swiss Political Science Review 22(2): 264–287. doi:10.1111/spsr.12193.
## Not run: # Replication code for the full Swiss co-authorship ERGM in Leifeld (2018). # Note that the estimates can only be reproduced approximately due to # internal changes in the ergm package. library("network") library("ergm") data("ch_coauthor") # set up network object with node attributes ch_nw <- network(ch_coaut, directed = FALSE) set.vertex.attribute(ch_nw, "frequency", ch_nodeattr$num_publications) set.vertex.attribute(ch_nw, "status", as.character(ch_nodeattr$status)) set.vertex.attribute(ch_nw, "male", ch_nodeattr$male) set.vertex.attribute(ch_nw, "share_en_articles", ch_nodeattr$share_en_articles) # create same affiliation matrix ch_inst_indices <- which(grepl("^inst_.+", colnames(ch_nodeattr))) ch_same_affiliation <- as.matrix(ch_nodeattr[, ch_inst_indices]) %*% t(ch_nodeattr[, ch_inst_indices]) # create same chair matrix ch_nodeattr$chairtitle[ch_nodeattr$chairtitle == ""] <- NA ch_same_chair <- matrix(0, nrow = nrow(ch_same_affiliation), ncol = ncol(ch_same_affiliation)) for (i in 1:length(ch_nodeattr$chairtitle)) { for (j in 1:length(ch_nodeattr$chairtitle)) { if (i != j && !is.na(ch_nodeattr$chairtitle[i]) && !is.na(ch_nodeattr$chairtitle[j]) && ch_nodeattr$chairtitle[i] == ch_nodeattr$chairtitle[j] && ch_same_affiliation[i, j] == TRUE) { ch_same_chair[i, j] <- 1 } } } rownames(ch_same_chair) <- rownames(ch_same_affiliation) colnames(ch_same_chair) <- colnames(ch_same_affiliation) # create supervision matrix (same chair + affiliation + mixed seniority) ch_supervision <- ch_same_affiliation * ch_same_chair * matrix(ch_nodeattr$status == "professor", nrow = nrow(ch_same_chair), ncol = ncol(ch_same_chair), byrow = FALSE) * matrix(ch_nodeattr$status != "professor", nrow = nrow(ch_same_chair), ncol = ncol(ch_same_chair), byrow = TRUE) # ERGM estimation ch_model <- ergm(ch_nw ~ edges + gwesp(0.3, fixed = TRUE) + gwdegree(0.4, fixed = TRUE) + nodecov("frequency") + nodefactor("status") + nodefactor("male") + nodematch("male") + edgecov(ch_dist100km) + edgecov(ch_same_affiliation) + edgecov(ch_same_chair) + edgecov(ch_supervision) + edgecov(ch_topicsim) + nodecov("share_en_articles") + edgecov(ch_en_article_sim), control = control.ergm(MCMLE.termination = "Hummel", MCMLE.effectiveSize = NULL)) summary(ch_model) # corresponds Column 1 in Table 3 in Leifeld (2018) ## End(Not run)
## Not run: # Replication code for the full Swiss co-authorship ERGM in Leifeld (2018). # Note that the estimates can only be reproduced approximately due to # internal changes in the ergm package. library("network") library("ergm") data("ch_coauthor") # set up network object with node attributes ch_nw <- network(ch_coaut, directed = FALSE) set.vertex.attribute(ch_nw, "frequency", ch_nodeattr$num_publications) set.vertex.attribute(ch_nw, "status", as.character(ch_nodeattr$status)) set.vertex.attribute(ch_nw, "male", ch_nodeattr$male) set.vertex.attribute(ch_nw, "share_en_articles", ch_nodeattr$share_en_articles) # create same affiliation matrix ch_inst_indices <- which(grepl("^inst_.+", colnames(ch_nodeattr))) ch_same_affiliation <- as.matrix(ch_nodeattr[, ch_inst_indices]) %*% t(ch_nodeattr[, ch_inst_indices]) # create same chair matrix ch_nodeattr$chairtitle[ch_nodeattr$chairtitle == ""] <- NA ch_same_chair <- matrix(0, nrow = nrow(ch_same_affiliation), ncol = ncol(ch_same_affiliation)) for (i in 1:length(ch_nodeattr$chairtitle)) { for (j in 1:length(ch_nodeattr$chairtitle)) { if (i != j && !is.na(ch_nodeattr$chairtitle[i]) && !is.na(ch_nodeattr$chairtitle[j]) && ch_nodeattr$chairtitle[i] == ch_nodeattr$chairtitle[j] && ch_same_affiliation[i, j] == TRUE) { ch_same_chair[i, j] <- 1 } } } rownames(ch_same_chair) <- rownames(ch_same_affiliation) colnames(ch_same_chair) <- colnames(ch_same_affiliation) # create supervision matrix (same chair + affiliation + mixed seniority) ch_supervision <- ch_same_affiliation * ch_same_chair * matrix(ch_nodeattr$status == "professor", nrow = nrow(ch_same_chair), ncol = ncol(ch_same_chair), byrow = FALSE) * matrix(ch_nodeattr$status != "professor", nrow = nrow(ch_same_chair), ncol = ncol(ch_same_chair), byrow = TRUE) # ERGM estimation ch_model <- ergm(ch_nw ~ edges + gwesp(0.3, fixed = TRUE) + gwdegree(0.4, fixed = TRUE) + nodecov("frequency") + nodefactor("status") + nodefactor("male") + nodematch("male") + edgecov(ch_dist100km) + edgecov(ch_same_affiliation) + edgecov(ch_same_chair) + edgecov(ch_supervision) + edgecov(ch_topicsim) + nodecov("share_en_articles") + edgecov(ch_en_article_sim), control = control.ergm(MCMLE.termination = "Hummel", MCMLE.effectiveSize = NULL)) summary(ch_model) # corresponds Column 1 in Table 3 in Leifeld (2018) ## End(Not run)
Constructor for btergm objects.
createBtergm( coef, boot, R, nobs, time.steps, formula, formula2, response, effects, weights, auto.adjust, offset, directed, bipartite, nvertices, data )
createBtergm( coef, boot, R, nobs, time.steps, formula, formula2, response, effects, weights, auto.adjust, offset, directed, bipartite, nvertices, data )
coef |
Object of class |
boot |
Object of class |
R |
Object of class |
nobs |
Object of class |
time.steps |
Object of class |
formula |
Object of class |
formula2 |
The revised formula with the object references after applying
the |
response |
Object of class |
effects |
Object of class |
weights |
Object of class |
auto.adjust |
Object of class |
offset |
Object of class |
directed |
Object of class |
bipartite |
Object of class |
nvertices |
Number of vertices. |
data |
The data after processing by the |
Create an S4 btergm object using this constructor function.
Philip Leifeld
Other tergm-classes:
btergm-class
,
createMtergm()
,
createTbergm()
,
mtergm-class
,
tbergm-class
Constructor for mtergm objects.
createMtergm( coef, se, pval, nobs, time.steps, formula, formula2, auto.adjust, offset, directed, bipartite, estimate, loglik, aic, bic, ergm, nvertices, data )
createMtergm( coef, se, pval, nobs, time.steps, formula, formula2, auto.adjust, offset, directed, bipartite, estimate, loglik, aic, bic, ergm, nvertices, data )
coef |
Object of class |
se |
Standard errors. |
pval |
The p-values. |
nobs |
Object of class |
time.steps |
Object of class |
formula |
Object of class |
formula2 |
The revised formula with the object references after applying
the |
auto.adjust |
Object of class |
offset |
Object of class |
directed |
Object of class |
bipartite |
Object of class |
estimate |
Estimate: either MCMC MLE or MPLE. |
loglik |
Log likelihood of the MLE. |
aic |
Akaike's Information Criterion. |
bic |
Bayesian Information Criterion. |
ergm |
The original |
nvertices |
Number of vertices. |
data |
The data after processing by the |
Create an S4 mtergm object using this constructor function.
Philip Leifeld
Other tergm-classes:
btergm-class
,
createBtergm()
,
createTbergm()
,
mtergm-class
,
tbergm-class
Constructor for tbergm objects.
createTbergm( time.steps, formula, formula2, auto.adjust, offset, directed, bipartite, estimate, bergm, nvertices, data )
createTbergm( time.steps, formula, formula2, auto.adjust, offset, directed, bipartite, estimate, bergm, nvertices, data )
time.steps |
Object of class |
formula |
Object of class |
formula2 |
The revised formula with the object references after applying
the |
auto.adjust |
Object of class |
offset |
Object of class |
directed |
Object of class |
bipartite |
Object of class |
estimate |
Estimate: |
bergm |
The original |
nvertices |
Number of vertices. |
data |
The data after processing by the |
Create an S4 tbergm object using this constructor function.
Philip Leifeld
Other tergm-classes:
btergm-class
,
createBtergm()
,
createMtergm()
,
mtergm-class
,
tbergm-class
Create all predicted tie probabilities using MPLE.
edgeprob(object, verbose = FALSE)
edgeprob(object, verbose = FALSE)
object |
An |
verbose |
Print details? |
For a given (T)ERGM, return a data frame with all predicted edge
probabilities along with the design matrix of the MPLE logit model, based
on the estimated coefficients and the design matrix, for all time points,
along with i
, j
, and t
variables indicating where the
respective dyad is located.
edgeprob
is a convenience function that creates a data frame with all
dyads in the ERGM or TERGM along with their edge probabilities and their
predictor values (i.e., change statistics). This is useful for creating
marginal effects plots or contrasting multiple groups of dyads. This function
works faster than the interpret
function.
The first variable in the resulting data frame contains the edge value (i.e., the dependent variable, which is usually binary). The next variables contain all the predictors from the ERGM or TERGM (i.e., the change statistics). The next five variables contain the indices of the sender (i), the receiver (j), the time step (t), the vertex id of i (i.name), and the vertex id of j (j.name). These five variables serve to identify the dyad. The last variable contains the computed edge probabilities.
Other interpretation:
interpret()
,
marginalplot()
Extract the model formula from a fitted object.
getformula(x) ## S4 method for signature 'ergm' getformula(x) ## S4 method for signature 'btergm' getformula(x) ## S4 method for signature 'mtergm' getformula(x) ## S4 method for signature 'tbergm' getformula(x)
getformula(x) ## S4 method for signature 'ergm' getformula(x) ## S4 method for signature 'btergm' getformula(x) ## S4 method for signature 'mtergm' getformula(x) ## S4 method for signature 'tbergm' getformula(x)
x |
A fitted model. |
The getformula
function will extract the formula from a fitted model.
getformula(ergm)
: Extract the formula from an ergm
object.
getformula(btergm)
: Extract the formula from a btergm
object.
getformula(mtergm)
: Extract the formula from an mtergm
object.
getformula(tbergm)
: Extract the formula from a tbergm
object.
Assess goodness of fit of btergm
and other network models.
gof(object, ...) createGOF( simulations, target, statistics = c(dsp, esp, deg, ideg, geodesic, rocpr, walktrap.modularity), parallel = "no", ncpus = 1, cl = NULL, verbose = TRUE, ... ) ## S4 method for signature 'btergm' gof( object, target = NULL, formula = getformula(object), nsim = 100, MCMC.interval = 1000, MCMC.burnin = 10000, parallel = c("no", "multicore", "snow"), ncpus = 1, cl = NULL, statistics = c(dsp, esp, deg, ideg, geodesic, rocpr, walktrap.modularity), verbose = TRUE, ... ) ## S4 method for signature 'ergm' gof( object, target = NULL, formula = getformula(object), nsim = 100, MCMC.interval = 1000, MCMC.burnin = 10000, parallel = c("no", "multicore", "snow"), ncpus = 1, cl = NULL, statistics = c(dsp, esp, deg, ideg, geodesic, rocpr, walktrap.modularity), verbose = TRUE, ... ) ## S4 method for signature 'mtergm' gof( object, target = NULL, formula = getformula(object), nsim = 100, MCMC.interval = 1000, MCMC.burnin = 10000, parallel = c("no", "multicore", "snow"), ncpus = 1, cl = NULL, statistics = c(dsp, esp, deg, ideg, geodesic, rocpr, walktrap.modularity), verbose = TRUE, ... ) ## S4 method for signature 'tbergm' gof( object, target = NULL, formula = getformula(object), nsim = 100, MCMC.interval = 1000, MCMC.burnin = 10000, parallel = c("no", "multicore", "snow"), ncpus = 1, cl = NULL, statistics = c(dsp, esp, deg, ideg, geodesic, rocpr, walktrap.modularity), verbose = TRUE, ... ) ## S4 method for signature 'sienaFit' gof( object, period = NULL, parallel = c("no", "multicore", "snow"), ncpus = 1, cl = NULL, structzero = 10, statistics = c(esp, deg, ideg, geodesic, rocpr, walktrap.modularity), groupName = object$f$groupNames[[1]], varName = NULL, outofsample = FALSE, sienaData = NULL, sienaEffects = NULL, nsim = NULL, verbose = TRUE, ... ) ## S4 method for signature 'network' gof( object, covariates, coef, target = NULL, nsim = 100, mcmc = FALSE, MCMC.interval = 1000, MCMC.burnin = 10000, parallel = c("no", "multicore", "snow"), ncpus = 1, cl = NULL, statistics = c(dsp, esp, deg, ideg, geodesic, rocpr, walktrap.modularity), verbose = TRUE, ... ) ## S4 method for signature 'matrix' gof( object, covariates, coef, target = NULL, nsim = 100, mcmc = FALSE, MCMC.interval = 1000, MCMC.burnin = 10000, parallel = c("no", "multicore", "snow"), ncpus = 1, cl = NULL, statistics = c(dsp, esp, deg, ideg, geodesic, rocpr, walktrap.modularity), verbose = TRUE, ... )
gof(object, ...) createGOF( simulations, target, statistics = c(dsp, esp, deg, ideg, geodesic, rocpr, walktrap.modularity), parallel = "no", ncpus = 1, cl = NULL, verbose = TRUE, ... ) ## S4 method for signature 'btergm' gof( object, target = NULL, formula = getformula(object), nsim = 100, MCMC.interval = 1000, MCMC.burnin = 10000, parallel = c("no", "multicore", "snow"), ncpus = 1, cl = NULL, statistics = c(dsp, esp, deg, ideg, geodesic, rocpr, walktrap.modularity), verbose = TRUE, ... ) ## S4 method for signature 'ergm' gof( object, target = NULL, formula = getformula(object), nsim = 100, MCMC.interval = 1000, MCMC.burnin = 10000, parallel = c("no", "multicore", "snow"), ncpus = 1, cl = NULL, statistics = c(dsp, esp, deg, ideg, geodesic, rocpr, walktrap.modularity), verbose = TRUE, ... ) ## S4 method for signature 'mtergm' gof( object, target = NULL, formula = getformula(object), nsim = 100, MCMC.interval = 1000, MCMC.burnin = 10000, parallel = c("no", "multicore", "snow"), ncpus = 1, cl = NULL, statistics = c(dsp, esp, deg, ideg, geodesic, rocpr, walktrap.modularity), verbose = TRUE, ... ) ## S4 method for signature 'tbergm' gof( object, target = NULL, formula = getformula(object), nsim = 100, MCMC.interval = 1000, MCMC.burnin = 10000, parallel = c("no", "multicore", "snow"), ncpus = 1, cl = NULL, statistics = c(dsp, esp, deg, ideg, geodesic, rocpr, walktrap.modularity), verbose = TRUE, ... ) ## S4 method for signature 'sienaFit' gof( object, period = NULL, parallel = c("no", "multicore", "snow"), ncpus = 1, cl = NULL, structzero = 10, statistics = c(esp, deg, ideg, geodesic, rocpr, walktrap.modularity), groupName = object$f$groupNames[[1]], varName = NULL, outofsample = FALSE, sienaData = NULL, sienaEffects = NULL, nsim = NULL, verbose = TRUE, ... ) ## S4 method for signature 'network' gof( object, covariates, coef, target = NULL, nsim = 100, mcmc = FALSE, MCMC.interval = 1000, MCMC.burnin = 10000, parallel = c("no", "multicore", "snow"), ncpus = 1, cl = NULL, statistics = c(dsp, esp, deg, ideg, geodesic, rocpr, walktrap.modularity), verbose = TRUE, ... ) ## S4 method for signature 'matrix' gof( object, covariates, coef, target = NULL, nsim = 100, mcmc = FALSE, MCMC.interval = 1000, MCMC.burnin = 10000, parallel = c("no", "multicore", "snow"), ncpus = 1, cl = NULL, statistics = c(dsp, esp, deg, ideg, geodesic, rocpr, walktrap.modularity), verbose = TRUE, ... )
object |
A |
... |
Arbitrary further arguments to be passed on to the statistics. See also the help page for the gof-statistics. |
simulations |
A list of |
target |
In the |
statistics |
A list of functions used for comparison of observed and simulated networks. Note that the list should contain the actual functions, not a character representation of them. See gof-statistics for details. |
parallel |
Use multiple cores in a computer or nodes in a cluster to
speed up the simulations. The default value |
ncpus |
The number of CPU cores used for parallel GOF assessment (only
if |
cl |
An optional parallel or snow cluster for use if
|
verbose |
Print details? |
formula |
A model formula from which networks are simulated for
comparison. By default, the formula from the |
nsim |
The number of networks to be simulated at each time step.
Example: If there are six time steps in the |
MCMC.interval |
Internally, this package uses the simulation facilities
of the ergm package to create new networks against which to compare
the original network(s) for goodness-of-fit assessment. This argument sets
the MCMC interval to be passed over to the simulation command. The default
value is |
MCMC.burnin |
Internally, this package uses the simulation facilities of
the ergm package to create new networks against which to compare the
original network(s) for goodness-of-fit assessment. This argument sets the
MCMC burnin to be passed over to the simulation command. The default value
is |
period |
Which transition between time periods should be used for GOF
assessment? By default, all transitions between all time periods are used.
For example, if there are three consecutive networks, this will extract
simulations from the transitions between 1 and 2 and between 2 and 3,
respectively, and these simulations will be compared to the networks at
time steps 2 and 3, respectively. The time period can be provided as a
numeric, e.g., |
structzero |
Which value was used for structural zeros (usually nodes
that have dropped out of the network or have not yet joined the network) in
the dependent variable/network? These nodes are removed from the observed
network and the simulations before comparison. Usually, the value |
groupName |
The group name used in the Siena model. |
varName |
The variable name that denotes the dependent networks in the Siena model. |
outofsample |
Should out-of-sample prediction be attempted? If so, some
additional arguments must be provided: |
sienaData |
An object of the class |
sienaEffects |
An object of the class |
covariates |
A list of matrices or network objects that serve as
covariates for the dependent network. The covariates in this list are
automatically added to the formula as |
coef |
A vector of coefficients. |
mcmc |
Should statnet's MCMC methods be used for simulating new
networks? If |
The generic gof
function provides goodness-of-fit measures and
degeneracy checks for btergm
, mtergm
, tbergm
,
ergm
, sienaFit
, and custom dyadic-independent models. The user
can provide a list of network statistics for comparing simulated networks
based on the estimated model with the observed network(s). See
gof-statistics
. The objects created by these methods can be
displayed using various plot and print methods (see gof-plot
).
In-sample GOF assessment is the default, which means that the same time steps
are used for creating simulations and for comparison with the observed
network(s). It is possible to do out-of-sample prediction by specifying a
(list of) target network(s) using the target
argument. If a formula is
provided, the simulations are based on the networks and covariates specified
in the formula. This is helpful in situations where complex out-of-sample
predictions have to be evaluated. A usage scenario could be to simulate from
a network at time t
(provided through the formula
argument) and
compare to an observed network at time t + 1
(the target
argument). This can be done, for example, to assess predictive performance
between time steps of the original networks, or to check whether the model
performs well with regard to a newly measured network given the old data from
the previous time step.
Predictive fit can also be assessed for stochastic actor-oriented models
(SAOM) as implemented in the RSiena package. After compiling the usual
objects (model, data, effects), one of the time steps can be predicted based
on the previous time step and the SAOM using the sienaFit
method of
the gof
function. By default, however, within-sample fit is used for
SAOMs, just like for (T)ERGMs.
The gof
methods for networks and matrices serve to assess the goodness
of fit of a dyadic-independence model. To do this, the method requires a
vector of coefficients (one coefficient for the intercept or edges
term and one coefficient for each covariate), a list of covariates (in matrix
or network shape), and a dependent network or matrix. This is useful for
assessing the goodness of fit of QAP-adjusted logistic regression models (as
implemented in the netlogit
function in the sna package) or
other dyadic-independence models, such as models fitted using glm
.
Note that this method only works with cross-sectional models and does not
accept lists of networks as input data.
The createGOF
function is used internally by the gof
function
in order to create a gof
object from a list of simulated networks and
a list of target networks to compare against. It can also be used directly by
the end user if the user wants to supply lists of simulated and target
networks from other sources.
Leifeld, Philip, Skyler J. Cranmer and Bruce A. Desmarais (2018): Temporal Exponential Random Graph Models with btergm: Estimation and Bootstrap Confidence Intervals. Journal of Statistical Software 83(6): 1–36. doi:10.18637/jss.v083.i06.
Leifeld, Philip and Skyler J. Cranmer (2019): A Theoretical and Empirical Comparison of the Temporal Exponential Random Graph Model and the Stochastic Actor-Oriented Model. Network Science 7(1): 20–51. doi:10.1017/nws.2018.26.
Plot and print methods for goodness-of-fit output for network models.
## S3 method for class 'boxplot' print(x, ...) ## S3 method for class 'roc' print(x, ...) ## S3 method for class 'pr' print(x, ...) ## S3 method for class 'rocpr' print(x, ...) ## S3 method for class 'univariate' print(x, ...) ## S3 method for class 'gof' print(x, ...) ## S3 method for class 'gof' plot(x, mfrow = TRUE, ...) ## S3 method for class 'boxplot' plot( x, relative = TRUE, transform = function(x) x, xlim = NULL, main = x$label, xlab = x$label, ylab = "Frequency", border = "darkgray", boxplot.lwd = 0.8, outline = FALSE, median = TRUE, median.col = "black", median.lty = "solid", median.lwd = 2, mean = TRUE, mean.col = "black", mean.lty = "dashed", mean.lwd = 1, ... ) ## S3 method for class 'roc' plot( x, add = FALSE, main = x$label, avg = c("none", "horizontal", "vertical", "threshold"), spread.estimate = c("boxplot", "stderror", "stddev"), lwd = 3, rgraph = FALSE, col = "#bd0017", random.col = "#bd001744", ... ) ## S3 method for class 'pr' plot( x, add = FALSE, main = x$label, avg = c("none", "horizontal", "vertical", "threshold"), spread.estimate = c("boxplot", "stderror", "stddev"), lwd = 3, rgraph = FALSE, col = "#5886be", random.col = "#5886be44", pr.poly = 0, ... ) ## S3 method for class 'rocpr' plot( x, main = x$label, roc.avg = c("none", "horizontal", "vertical", "threshold"), roc.spread.estimate = c("boxplot", "stderror", "stddev"), roc.lwd = 3, roc.rgraph = FALSE, roc.col = "#bd0017", roc.random.col = "#bd001744", pr.avg = c("none", "horizontal", "vertical", "threshold"), pr.spread.estimate = c("boxplot", "stderror", "stddev"), pr.lwd = 3, pr.rgraph = FALSE, pr.col = "#5886be", pr.random.col = "#5886be44", pr.poly = 0, ... ) ## S3 method for class 'univariate' plot( x, main = x$label, sim.hist = TRUE, sim.bar = TRUE, sim.density = TRUE, obs.hist = FALSE, obs.bar = TRUE, obs.density = TRUE, sim.adjust = 1, obs.adjust = 1, sim.lwd = 2, obs.lwd = 2, sim.col = "black", obs.col = "red", ... )
## S3 method for class 'boxplot' print(x, ...) ## S3 method for class 'roc' print(x, ...) ## S3 method for class 'pr' print(x, ...) ## S3 method for class 'rocpr' print(x, ...) ## S3 method for class 'univariate' print(x, ...) ## S3 method for class 'gof' print(x, ...) ## S3 method for class 'gof' plot(x, mfrow = TRUE, ...) ## S3 method for class 'boxplot' plot( x, relative = TRUE, transform = function(x) x, xlim = NULL, main = x$label, xlab = x$label, ylab = "Frequency", border = "darkgray", boxplot.lwd = 0.8, outline = FALSE, median = TRUE, median.col = "black", median.lty = "solid", median.lwd = 2, mean = TRUE, mean.col = "black", mean.lty = "dashed", mean.lwd = 1, ... ) ## S3 method for class 'roc' plot( x, add = FALSE, main = x$label, avg = c("none", "horizontal", "vertical", "threshold"), spread.estimate = c("boxplot", "stderror", "stddev"), lwd = 3, rgraph = FALSE, col = "#bd0017", random.col = "#bd001744", ... ) ## S3 method for class 'pr' plot( x, add = FALSE, main = x$label, avg = c("none", "horizontal", "vertical", "threshold"), spread.estimate = c("boxplot", "stderror", "stddev"), lwd = 3, rgraph = FALSE, col = "#5886be", random.col = "#5886be44", pr.poly = 0, ... ) ## S3 method for class 'rocpr' plot( x, main = x$label, roc.avg = c("none", "horizontal", "vertical", "threshold"), roc.spread.estimate = c("boxplot", "stderror", "stddev"), roc.lwd = 3, roc.rgraph = FALSE, roc.col = "#bd0017", roc.random.col = "#bd001744", pr.avg = c("none", "horizontal", "vertical", "threshold"), pr.spread.estimate = c("boxplot", "stderror", "stddev"), pr.lwd = 3, pr.rgraph = FALSE, pr.col = "#5886be", pr.random.col = "#5886be44", pr.poly = 0, ... ) ## S3 method for class 'univariate' plot( x, main = x$label, sim.hist = TRUE, sim.bar = TRUE, sim.density = TRUE, obs.hist = FALSE, obs.bar = TRUE, obs.density = TRUE, sim.adjust = 1, obs.adjust = 1, sim.lwd = 2, obs.lwd = 2, sim.col = "black", obs.col = "red", ... )
x |
An object created by one of the |
... |
Arbitrary further arguments. |
mfrow |
Should the GOF plots come out separately ( |
relative |
Print relative frequencies (as opposed to absolute frequencies) of a statistic on the y axis? |
transform |
A function which transforms the y values used for the
boxplots. For example, if some of the values become very large and make the
output illegible, |
xlim |
Horizontal limit of the boxplots. Only the maximum value must be
provided, e.g., |
main |
Main title of a GOF plot. |
xlab |
Label of the x-axis of a GOF plot. |
ylab |
Label of the y-axis of a GOF plot. |
border |
Color of the borders of the boxplots. |
boxplot.lwd |
Line width of boxplot. |
outline |
Print outliers in the boxplots? |
median |
Plot the median curve for the observed network? |
median.col |
Color of the median of the observed network statistic. |
median.lty |
Line type of median line. For example "dashed" or "solid". |
median.lwd |
Line width of median line. |
mean |
Plot the mean curve for the observed network? |
mean.col |
Color of the mean of the observed network statistic. |
mean.lty |
Line type of mean line. For example "dashed" or "solid". |
mean.lwd |
Line width of mean line. |
add |
Add the ROC and/or PR curve to an existing plot? |
avg |
Averaging pattern for the ROC and PR curve(s) if multiple target
time steps were used. Allowed values are |
spread.estimate |
When multiple target time steps are used and curve
averaging is enabled, the variation around the average curve can be
visualized as standard error bars ( |
lwd |
Line width. |
rgraph |
Should an ROC or PR curve also be drawn for a random graph? This serves as a baseline against which to compare the actual ROC or PR curve. |
col |
Color of the ROC or PR curve. |
random.col |
Color of the ROC or PR curve of the random graph prediction. |
pr.poly |
If a value of |
roc.avg |
Averaging pattern for the ROC curve(s) if multiple target time
steps were used. Allowed values are |
roc.spread.estimate |
When multiple target time steps are used and curve
averaging is enabled, the variation around the average curve can be
visualized as standard error bars ( |
roc.lwd |
Line width. |
roc.rgraph |
Should an ROC curve also be drawn for a random graph? This serves as a baseline against which to compare the actual ROC curve. |
roc.col |
Color of the ROC curve. |
roc.random.col |
Color of the ROC curve of the random graph prediction. |
pr.avg |
Averaging pattern for the PR curve(s) if multiple target time
steps were used. Allowed values are |
pr.spread.estimate |
When multiple target time steps are used and curve
averaging is enabled, the variation around the average curve can be
visualized as standard error bars ( |
pr.lwd |
Line width. |
pr.rgraph |
Should an PR curve also be drawn for a random graph? This serves as a baseline against which to compare the actual PR curve. |
pr.col |
Color of the PR curve. |
pr.random.col |
Color of the PR curve of the random graph prediction. |
sim.hist |
Draw a histogram for the simulated networks? |
sim.bar |
Draw a bar for the median of the statistic for the simulated networks? |
sim.density |
Draw a density curve fot the statistic for the simulated networks? |
obs.hist |
Draw a histogram for the observed networks? |
obs.bar |
Draw a bar for the median of the statistic for the observed networks? |
obs.density |
Draw a density curve fot the statistic for the observed networks? |
sim.adjust |
Bandwidth adjustment parameter for the density curve. |
obs.adjust |
Bandwidth adjustment parameter for the density curve. |
sim.lwd |
Line width for the simulated networks. |
obs.lwd |
Line width for the observed network(s). |
sim.col |
Color for the simulated networks. |
obs.col |
Color for the observed network(s). |
These plot and print methods serve to display the output generated by the
gof
function and its methods. See the help page of
gof-methods
for details on how to compute goodness-of-fit
statistics.
Leifeld, Philip, Skyler J. Cranmer and Bruce A. Desmarais (2018): Temporal Exponential Random Graph Models with btergm: Estimation and Bootstrap Confidence Intervals. Journal of Statistical Software 83(6): 1–36. doi:10.18637/jss.v083.i06.
Statistics for goodness-of-fit assessment of network models.
dsp(mat, ...) esp(mat, ...) nsp(mat, ...) deg(mat, ...) b1deg(mat, ...) b2deg(mat, ...) odeg(mat, ...) ideg(mat, ...) kstar(mat, ...) b1star(mat, ...) b2star(mat, ...) ostar(mat, ...) istar(mat, ...) kcycle(mat, ...) geodesic(mat, ...) triad.directed(mat, ...) triad.undirected(mat, ...) comemb(vec) walktrap.modularity(mat, ...) walktrap.roc(sim, obs, ...) walktrap.pr(sim, obs, ...) fastgreedy.modularity(mat, ...) fastgreedy.roc(sim, obs, ...) fastgreedy.pr(sim, obs, ...) louvain.modularity(mat, ...) louvain.roc(sim, obs, ...) louvain.pr(sim, obs, ...) maxmod.modularity(mat, ...) maxmod.roc(sim, obs, ...) maxmod.pr(sim, obs, ...) edgebetweenness.modularity(mat, ...) edgebetweenness.roc(sim, obs, ...) edgebetweenness.pr(sim, obs, ...) spinglass.modularity(mat, ...) spinglass.roc(sim, obs, ...) spinglass.pr(sim, obs, ...) rocpr(sim, obs, roc = TRUE, pr = TRUE, joint = TRUE, pr.impute = "poly4", ...)
dsp(mat, ...) esp(mat, ...) nsp(mat, ...) deg(mat, ...) b1deg(mat, ...) b2deg(mat, ...) odeg(mat, ...) ideg(mat, ...) kstar(mat, ...) b1star(mat, ...) b2star(mat, ...) ostar(mat, ...) istar(mat, ...) kcycle(mat, ...) geodesic(mat, ...) triad.directed(mat, ...) triad.undirected(mat, ...) comemb(vec) walktrap.modularity(mat, ...) walktrap.roc(sim, obs, ...) walktrap.pr(sim, obs, ...) fastgreedy.modularity(mat, ...) fastgreedy.roc(sim, obs, ...) fastgreedy.pr(sim, obs, ...) louvain.modularity(mat, ...) louvain.roc(sim, obs, ...) louvain.pr(sim, obs, ...) maxmod.modularity(mat, ...) maxmod.roc(sim, obs, ...) maxmod.pr(sim, obs, ...) edgebetweenness.modularity(mat, ...) edgebetweenness.roc(sim, obs, ...) edgebetweenness.pr(sim, obs, ...) spinglass.modularity(mat, ...) spinglass.roc(sim, obs, ...) spinglass.pr(sim, obs, ...) rocpr(sim, obs, roc = TRUE, pr = TRUE, joint = TRUE, pr.impute = "poly4", ...)
mat |
A sparse network matrix as created by the |
... |
Additional arguments. This must be present in all auxiliary GOF statistics. |
vec |
A vector of community memberships in order to create a community co-membership matrix. |
sim |
A list of simulated networks. Each element in the list should be a
sparse matrix as created by the |
obs |
A list of observed (= target) networks. Each element in the list
should be a sparse matrix as created by the |
roc |
Compute receiver-operating characteristics (ROC)? |
pr |
Compute precision-recall curve (PR)? |
joint |
Merge all time steps into a single big prediction task and compute predictive fit (instead of computing GOF for all time steps separately)? |
pr.impute |
In some cases, the first precision value of the
precision-recall curve is undefined. The |
These functions can be plugged into the statistics
argument of the
gof
methods in order to compare observed with simulated networks (see
the gof-methods help page). There are three types of statistics:
Univariate statistics, which aggregate a network into a single
quantity. For example, modularity measures or density. The distribution
of statistics can be displayed using histograms, density plots, and
median bars. Univariate statistics take a sparse matrix (mat
)
as an argument and return a single numeric value that summarize a network
matrix.
Multivariate statistics, which aggregate a network into a vector of
quantities. For example, the distribution of geodesic distances, edgewise
shared partners, or indegree. These statistics typically have multiple
values, e.g., esp(1), esp(2), esp(3) etc. The results can be displayed
using multiple boxplots for simulated networks and a black curve for the
observed network(s). Multivariate statistics take a sparse matrix
(mat
) as an argument and return a vector of numeric values that
summarize a network matrix.
Tie prediction statistics, which predict dyad states the observed
network(s) by the dyad states in the simulated networks. For example,
receiver operating characteristics (ROC) or precision-recall curves (PR)
of simulated networks based on the model, or ROC or PR predictions of
community co-membership matrices of the simulated vs. the observed
network(s). Tie prediction statistics take a list of simulated sparse
network matrices and another list of observed sparse network matrices
(possibly containing only a single sparse matrix) as arguments and return
a rocpr
, roc
, or pr
object (as created by the
rocpr function).
Users can create their own statistics for use with the gof
methods. To
do so, one needs to write a function that accepts and returns the respective
objects described in the enumeration above. It is advisable to look at the
definitions of some of the existing functions to add custom functions. It is
also possible to add an attribute called label
to the return object,
which describes what is being returned by the function. This label will be
used as a descriptive label in the plot and for verbose output during
computations. The examples section contains an example of a custom user
statistic. Note that all statistics must contain the ...
argument to ensure that custom arguments of other statistics do not cause an
error.
To aid the development of custom statistics, the helper function
comemb
is available: it accepts a vector of community memberships and
converts it to a co-membership matrix. This function is also used internally
by statistics like walktrap.roc
and others.
dsp()
: Multivariate GOF statistic: dyad-wise shared
partner distribution
esp()
: Multivariate GOF statistic: edge-wise shared
partner distribution
nsp()
: Multivariate GOF statistic: non-edge-wise shared
partner distribution
deg()
: Multivariate GOF statistic: degree distribution
b1deg()
: Multivariate GOF statistic: degree distribution
for the first mode
b2deg()
: Multivariate GOF statistic: degree distribution
for the second mode
odeg()
: Multivariate GOF statistic: outdegree distribution
ideg()
: Multivariate GOF statistic: indegree distribution
kstar()
: Multivariate GOF statistic: k-star distribution
b1star()
: Multivariate GOF statistic: k-star distribution
for the first mode
b2star()
: Multivariate GOF statistic: k-star distribution
for the second mode
ostar()
: Multivariate GOF statistic: outgoing k-star
distribution
istar()
: Multivariate GOF statistic: incoming k-star
distribution
kcycle()
: Multivariate GOF statistic: k-cycle distribution
geodesic()
: Multivariate GOF statistic: geodesic distance
distribution
triad.directed()
: Multivariate GOF statistic: triad census in
directed networks
triad.undirected()
: Multivariate GOF statistic: triad census in
undirected networks
comemb()
: Helper function: create community co-membership
matrix
walktrap.modularity()
: Univariate GOF statistic: Walktrap modularity
distribution
walktrap.roc()
: Tie prediction GOF statistic: ROC of Walktrap
community detection. Receiver-operating characteristics of predicting the
community structure in the observed network(s) by the community structure
in the simulated networks, as computed by the Walktrap algorithm.
walktrap.pr()
: Tie prediction GOF statistic: PR of Walktrap
community detection. Precision-recall curve for predicting the community
structure in the observed network(s) by the community structure in the
simulated networks, as computed by the Walktrap algorithm.
fastgreedy.modularity()
: Univariate GOF statistic: fast and greedy
modularity distribution
fastgreedy.roc()
: Tie prediction GOF statistic: ROC of fast and
greedy community detection. Receiver-operating characteristics of
predicting the community structure in the observed network(s) by the
community structure in the simulated networks, as computed by the fast and
greedy algorithm. Only sensible with undirected networks.
fastgreedy.pr()
: Tie prediction GOF statistic: PR of fast and
greedy community detection. Precision-recall curve for predicting the
community structure in the observed network(s) by the community structure
in the simulated networks, as computed by the fast and greedy algorithm.
Only sensible with undirected networks.
louvain.modularity()
: Univariate GOF statistic: Louvain clustering
modularity distribution
louvain.roc()
: Tie prediction GOF statistic: ROC of Louvain
community detection. Receiver-operating characteristics of predicting the
community structure in the observed network(s) by the community structure
in the simulated networks, as computed by the Louvain algorithm.
louvain.pr()
: Tie prediction GOF statistic: PR of Louvain
community detection. Precision-recall curve for predicting the community
structure in the observed network(s) by the community structure in the
simulated networks, as computed by the Louvain algorithm.
maxmod.modularity()
: Univariate GOF statistic: maximal modularity
distribution
maxmod.roc()
: Tie prediction GOF statistic: ROC of maximal
modularity community detection. Receiver-operating characteristics of
predicting the community structure in the observed network(s) by the
community structure in the simulated networks, as computed by the
modularity maximization algorithm.
maxmod.pr()
: Tie prediction GOF statistic: PR of maximal
modularity community detection. Precision-recall curve for predicting the
community structure in the observed network(s) by the community structure
in the simulated networks, as computed by the modularity maximization
algorithm.
edgebetweenness.modularity()
: Univariate GOF statistic: edge betweenness
modularity distribution
edgebetweenness.roc()
: Tie prediction GOF statistic: ROC of edge
betweenness community detection. Receiver-operating characteristics of
predicting the community structure in the observed network(s) by the
community structure in the simulated networks, as computed by the
Girvan-Newman edge betweenness community detection method.
edgebetweenness.pr()
: Tie prediction GOF statistic: PR of edge
betweenness community detection. Precision-recall curve for predicting the
community structure in the observed network(s) by the community structure
in the simulated networks, as computed by the Girvan-Newman edge
betweenness community detection method.
spinglass.modularity()
: Univariate GOF statistic: spinglass modularity
distribution
spinglass.roc()
: Tie prediction GOF statistic: ROC of spinglass
community detection. Receiver-operating characteristics of predicting the
community structure in the observed network(s) by the community structure
in the simulated networks, as computed by the Spinglass algorithm.
spinglass.pr()
: Tie prediction GOF statistic: PR of spinglass
community detection. Precision-recall curve for predicting the community
structure in the observed network(s) by the community structure in the
simulated networks, as computed by the Spinglass algorithm.
rocpr()
: Tie prediction GOF statistic: ROC and PR curves.
Receiver-operating characteristics (ROC) and precision-recall curve (PR).
Prediction of the dyad states of the observed network(s) by the dyad states
of the simulated networks.
Leifeld, Philip, Skyler J. Cranmer and Bruce A. Desmarais (2018): Temporal Exponential Random Graph Models with btergm: Estimation and Bootstrap Confidence Intervals. Journal of Statistical Software 83(6): 1–36. doi:10.18637/jss.v083.i06.
# To see how these statistics are used, look at the examples section of # ?"gof-methods". The following example illustrates how custom # statistics can be created. Suppose one is interested in the density # of a network. Then a univariate statistic can be created as follows. dens <- function(mat, ...) { # univariate: one argument mat <- as.matrix(mat) # sparse matrix -> normal matrix d <- sna::gden(mat) # compute the actual statistic attributes(d)$label <- "Density" # add a descriptive label return(d) # return the statistic } # Note that the '...' argument must be present in all statistics. # Now the statistic can be used in the statistics argument of one of # the gof methods. # For illustrative purposes, let us consider an existing statistic, the # indegree distribution, a multivariate statistic. It also accepts a # single argument. Note that the sparse matrix is converted to a # normal matrix object when it is used. First, statnet's summary # method is used to compute the statistic. Names are attached to the # resulting vector for the different indegree values. Then the vector # is returned. ideg <- function(mat, ...) { d <- summary(mat ~ idegree(0:(nrow(mat) - 1))) names(d) <- 0:(length(d) - 1) attributes(d)$label <- "Indegree" return(d) } # See the gofstatistics.R file in the package for more complex examples.
# To see how these statistics are used, look at the examples section of # ?"gof-methods". The following example illustrates how custom # statistics can be created. Suppose one is interested in the density # of a network. Then a univariate statistic can be created as follows. dens <- function(mat, ...) { # univariate: one argument mat <- as.matrix(mat) # sparse matrix -> normal matrix d <- sna::gden(mat) # compute the actual statistic attributes(d)$label <- "Density" # add a descriptive label return(d) # return the statistic } # Note that the '...' argument must be present in all statistics. # Now the statistic can be used in the statistics argument of one of # the gof methods. # For illustrative purposes, let us consider an existing statistic, the # indegree distribution, a multivariate statistic. It also accepts a # single argument. Note that the sparse matrix is converted to a # normal matrix object when it is used. First, statnet's summary # method is used to compute the statistic. Names are attached to the # resulting vector for the different indegree values. Then the vector # is returned. ideg <- function(mat, ...) { d <- summary(mat ~ idegree(0:(nrow(mat) - 1))) names(d) <- 0:(length(d) - 1) attributes(d)$label <- "Indegree" return(d) } # See the gofstatistics.R file in the package for more complex examples.
Process NA
values (= remove nodes with NA
s iteratively).
handleMissings(mat, na = NA, method = "remove", logical = FALSE)
handleMissings(mat, na = NA, method = "remove", logical = FALSE)
mat |
A matrix object. |
na |
The value that missing data are coded as. Usually |
method |
What should be done with the missing data? If
|
logical |
Return a matrix with logical values indicating which cells should be removed? By default the manipulated matrix is returned. |
This function deals with missing data in matrices or network objects used for
inferential network analysis. It can either remove missing rows and/or
columns iteratively (rows and columns with more NA
values first, then
successively rows and columns with fewer NA
entries) or replace
missing values by the modal value of the matrix or by 0
. The function
can return either the manipulated matrix or a matrix with logical values
indicating which of the cells should be removed.
Either a matrix in which missing data were taken care of or a matrix indicating where missing data are located.
Micro-level interpretation of (T)ERGMs.
interpret(object, ...) ## S4 method for signature 'ergm' interpret( object, formula = getformula(object), coefficients = coef(object), target = NULL, type = "tie", i, j ) ## S4 method for signature 'btergm' interpret( object, formula = getformula(object), coefficients = coef(object), target = NULL, type = "tie", i, j, t = 1:[email protected] ) ## S4 method for signature 'mtergm' interpret( object, formula = getformula(object), coefficients = coef(object), target = NULL, type = "tie", i, j, t = 1:[email protected] )
interpret(object, ...) ## S4 method for signature 'ergm' interpret( object, formula = getformula(object), coefficients = coef(object), target = NULL, type = "tie", i, j ) ## S4 method for signature 'btergm' interpret( object, formula = getformula(object), coefficients = coef(object), target = NULL, type = "tie", i, j, t = 1:object@time.steps ) ## S4 method for signature 'mtergm' interpret( object, formula = getformula(object), coefficients = coef(object), target = NULL, type = "tie", i, j, t = 1:object@time.steps )
object |
An |
... |
Further arguments to be passed on to subroutines. |
formula |
The formula to be used for computing probabilities. By default, the formula embedded in the model object is retrieved and used. |
coefficients |
The estimates on which probabilities should be based. By
default, the coefficients from the model object are retrieved and used.
Custom coefficients can be handed over, for example, in order to compare
versions of the model where the reciprocity term is fixed at |
target |
The response network on which probabilities are based.
Depending on whether the function is applied to an |
type |
If |
i |
A single (sender) node |
j |
A single (receiver) node |
t |
A vector of (numerical) time steps for which the probabilities
should be computed. This only applies to |
The interpret
function facilitates interpretation of ERGMs and TERGMs
at the micro level, as described in Desmarais and Cranmer (2012). There are
methods for ergm
objects, btergm
objects, and mtergm
objects. The function can be used to interpret these models at the tie or
edge level, dyad level, and block level. For example, what is the probability
that two specific nodes i
(the sender) and j
(the receiver) are
connected given the rest of the network and given the model? Or what is the
probability that any two nodes are tied at t = 2
if they were tied (or
disconnected) at t = 1
(i.e., what is the amount of tie stability)?
These tie- or edge-level questions can be answered if the type = "tie"
argument is used.
Another example: What is the probability that node i
has a tie to node
j
but not vice-versa? Or that i
and j
maintain a
reciprocal tie? Or that they are disconnected? How much more or less likely
are i
and j
reciprocally connected if the mutual
term in
the model is fixed at 0
(compared to the model that includes the
estimated parameter for reciprocity)? See example below. These dyad-level
questions can be answered if the type = "dyad"
argument is used.
Or what is the probability that a specific node i
is connected to
nodes j1
and j2
but not to j5
and j7
? And how
likely is any node i
to be connected to exactly four j
nodes?
These node-level questions (focusing on the ties of node i
or node
j
) can be answered by using the type = "node"
argument.
The typical procedure is to manually enumerate all dyads or
sender-receiver-time combinations with certain properties and repeat the same
thing with some alternative properties for contrasting the two groups. Then
apply the interpret
function to the two groups of dyads and compute a
measure of central tendency (e.g., mean or median) and possibly some
uncertainy measure (i.e., confidence intervals) from the distribution of
dyadic probabilities in each group. For example, if there is a gender
attribute, one can sample male-male or female-female dyads, compute the
distributions of edge probabilities for the two sets of dyads, and create
boxplots or barplots with confidence intervals for the two types of dyads in
order to contrast edge probabilities for male versus female same-sex dyads.
See also the edgeprob
function for automatic computation of all
dyadic edge probabilities.
interpret(ergm)
: Interpret method for ergm
objects
interpret(btergm)
: Interpret method for btergm
objects
interpret(mtergm)
: Interpret method for mtergm
objects
Desmarais, Bruce A. and Skyler J. Cranmer (2012): Micro-Level Interpretation of Exponential Random Graph Models with Application to Estuary Networks. Policy Studies Journal 40(3): 402–434. doi:10.1111/j.1541-0072.2012.00459.x.
Leifeld, Philip, Skyler J. Cranmer and Bruce A. Desmarais (2017): Temporal Exponential Random Graph Models with btergm: Estimation and Bootstrap Confidence Intervals. Journal of Statistical Software 83(6): 1–36. doi:10.18637/jss.v083.i06.
Czarna, Anna Z., Philip Leifeld, Magdalena Smieja, Michael Dufner and Peter Salovey (2016): Do Narcissism and Emotional Intelligence Win Us Friends? Modeling Dynamics of Peer Popularity Using Inferential Network Analysis. Personality and Social Psychology Bulletin 42(11): 1588–1599. doi:10.1177/0146167216666265.
Other interpretation:
edgeprob()
,
marginalplot()
##### The following example is a TERGM adaptation of the ##### ##### dyad-level example provided in figure 5(c) on page ##### ##### 424 of Desmarais and Cranmer (2012) in the PSJ. At ##### ##### each time step, it compares dyadic probabilities ##### ##### (no tie, unidirectional tie, and reciprocal tie ##### ##### probability) between a fitted model and a model ##### ##### where the reciprocity effect is fixed at 0 based ##### ##### on 20 randomly selected dyads per time step. The ##### ##### results are visualized using a grouped bar plot. ##### ## Not run: # create toy dataset and fit a model networks <- list() for (i in 1:3) { # create 3 random networks with 10 actors mat <- matrix(rbinom(100, 1, 0.25), nrow = 10, ncol = 10) diag(mat) <- 0 # loops are excluded nw <- network(mat) # create network object networks[[i]] <- nw # add network to the list } fit <- btergm(networks ~ edges + istar(2) + mutual, R = 200) # extract coefficients and create null hypothesis vector null <- coef(fit) # estimated coefs null[3] <- 0 # set mutual term = 0 # sample 20 dyads per time step and compute probability ratios probabilities <- matrix(nrow = 9, ncol = length(networks)) # nrow = 9 because three probabilities + upper and lower CIs colnames(probabilities) <- paste("t =", 1:length(networks)) for (t in 1:length(networks)) { d <- dim(as.matrix(networks[[t]])) # how many row and column nodes? size <- d[1] * d[2] # size of the matrix nw <- matrix(1:size, nrow = d[1], ncol = d[2]) nw <- nw[lower.tri(nw)] # sample only from lower triangle b/c samp <- sample(nw, 20) # dyadic probabilities are symmetric prob.est.00 <- numeric(0) prob.est.01 <- numeric(0) prob.est.11 <- numeric(0) prob.null.00 <- numeric(0) prob.null.01 <- numeric(0) prob.null.11 <- numeric(0) for (k in 1:20) { i <- arrayInd(samp[k], d)[1, 1] # recover 'i's and 'j's from sample j <- arrayInd(samp[k], d)[1, 2] # run interpretation function with estimated coefs and mutual = 0: int.est <- interpret(fit, type = "dyad", i = i, j = j, t = t) int.null <- interpret(fit, coefficients = null, type = "dyad", i = i, j = j, t = t) prob.est.00 <- c(prob.est.00, int.est[[1]][1, 1]) prob.est.11 <- c(prob.est.11, int.est[[1]][2, 2]) mean.est.01 <- (int.est[[1]][1, 2] + int.est[[1]][2, 1]) / 2 prob.est.01 <- c(prob.est.01, mean.est.01) prob.null.00 <- c(prob.null.00, int.null[[1]][1, 1]) prob.null.11 <- c(prob.null.11, int.null[[1]][2, 2]) mean.null.01 <- (int.null[[1]][1, 2] + int.null[[1]][2, 1]) / 2 prob.null.01 <- c(prob.null.01, mean.null.01) } prob.ratio.00 <- prob.est.00 / prob.null.00 # ratio of est. and null hyp prob.ratio.01 <- prob.est.01 / prob.null.01 prob.ratio.11 <- prob.est.11 / prob.null.11 probabilities[1, t] <- mean(prob.ratio.00) # mean estimated 00 tie prob probabilities[2, t] <- mean(prob.ratio.01) # mean estimated 01 tie prob probabilities[3, t] <- mean(prob.ratio.11) # mean estimated 11 tie prob ci.00 <- t.test(prob.ratio.00, conf.level = 0.99)$conf.int ci.01 <- t.test(prob.ratio.01, conf.level = 0.99)$conf.int ci.11 <- t.test(prob.ratio.11, conf.level = 0.99)$conf.int probabilities[4, t] <- ci.00[1] # lower 00 conf. interval probabilities[5, t] <- ci.01[1] # lower 01 conf. interval probabilities[6, t] <- ci.11[1] # lower 11 conf. interval probabilities[7, t] <- ci.00[2] # upper 00 conf. interval probabilities[8, t] <- ci.01[2] # upper 01 conf. interval probabilities[9, t] <- ci.11[2] # upper 11 conf. interval } # create barplots from probability ratios and CIs require("gplots") bp <- barplot2(probabilities[1:3, ], beside = TRUE, plot.ci = TRUE, ci.l = probabilities[4:6, ], ci.u = probabilities[7:9, ], col = c("tan", "tan2", "tan3"), ci.col = "grey40", xlab = "Dyadic tie values", ylab = "Estimated Prob./Null Prob.") mtext(1, at = bp, text = c("(0,0)", "(0,1)", "(1,1)"), line = 0, cex = 0.5) ##### The following examples illustrate the behavior of ##### ##### the interpret function with undirected and/or ##### ##### bipartite graphs with or without structural zeros. ##### library("statnet") library("btergm") # micro-level interpretation for undirected network with structural zeros set.seed(12345) mat <- matrix(rbinom(400, 1, 0.1), nrow = 20, ncol = 20) mat[1, 5] <- 1 mat[10, 7] <- 1 mat[15, 3] <- 1 mat[18, 4] < 1 nw <- network(mat, directed = FALSE, bipartite = FALSE) cv <- matrix(rnorm(400), nrow = 20, ncol = 20) offsetmat <- matrix(rbinom(400, 1, 0.1), nrow = 20, ncol = 20) offsetmat[1, 5] <- 1 offsetmat[10, 7] <- 1 offsetmat[15, 3] <- 1 offsetmat[18, 4] < 1 model <- ergm(nw ~ edges + kstar(2) + edgecov(cv) + offset(edgecov(offsetmat)), offset.coef = -Inf) summary(model) # tie-level interpretation (note that dyad interpretation would not make any # sense in an undirected network): interpret(model, type = "tie", i = 1, j = 2) # 0.28 (= normal dyad) interpret(model, type = "tie", i = 1, j = 5) # 0.00 (= structural zero) # node-level interpretation; note the many 0 probabilities due to the # structural zeros; also note the warning message that the probabilities may # be slightly imprecise because -Inf needs to be approximated by some large # negative number (-9e8): interpret(model, type = "node", i = 1, j = 3:5) # repeat the same exercise for a directed network nw <- network(mat, directed = TRUE, bipartite = FALSE) model <- ergm(nw ~ edges + istar(2) + edgecov(cv) + offset(edgecov(offsetmat)), offset.coef = -Inf) interpret(model, type = "tie", i = 1, j = 2) # 0.13 (= normal dyad) interpret(model, type = "tie", i = 1, j = 5) # 0.00 (= structural zero) interpret(model, type = "dyad", i = 1, j = 2) # results for normal dyad interpret(model, type = "dyad", i = 1, j = 5) # results for i->j struct. zero interpret(model, type = "node", i = 1, j = 3:5) # micro-level interpretation for bipartite graph with structural zeros set.seed(12345) mat <- matrix(rbinom(200, 1, 0.1), nrow = 20, ncol = 10) mat[1, 5] <- 1 mat[10, 7] <- 1 mat[15, 3] <- 1 mat[18, 4] < 1 nw <- network(mat, directed = FALSE, bipartite = TRUE) cv <- matrix(rnorm(200), nrow = 20, ncol = 10) # some covariate offsetmat <- matrix(rbinom(200, 1, 0.1), nrow = 20, ncol = 10) offsetmat[1, 5] <- 1 offsetmat[10, 7] <- 1 offsetmat[15, 3] <- 1 offsetmat[18, 4] < 1 model <- ergm(nw ~ edges + b1star(2) + edgecov(cv) + offset(edgecov(offsetmat)), offset.coef = -Inf) summary(model) # tie-level interpretation; note the index for the second mode starts with 21 interpret(model, type = "tie", i = 1, j = 21) # dyad-level interpretation does not make sense because network is undirected; # node-level interpretation prints warning due to structural zeros, but # computes the correct probabilities (though slightly imprecise because -Inf # is approximated by some small number: interpret(model, type = "node", i = 1, j = 21:25) # compute all dyadic probabilities dyads <- edgeprob(model) dyads ## End(Not run)
##### The following example is a TERGM adaptation of the ##### ##### dyad-level example provided in figure 5(c) on page ##### ##### 424 of Desmarais and Cranmer (2012) in the PSJ. At ##### ##### each time step, it compares dyadic probabilities ##### ##### (no tie, unidirectional tie, and reciprocal tie ##### ##### probability) between a fitted model and a model ##### ##### where the reciprocity effect is fixed at 0 based ##### ##### on 20 randomly selected dyads per time step. The ##### ##### results are visualized using a grouped bar plot. ##### ## Not run: # create toy dataset and fit a model networks <- list() for (i in 1:3) { # create 3 random networks with 10 actors mat <- matrix(rbinom(100, 1, 0.25), nrow = 10, ncol = 10) diag(mat) <- 0 # loops are excluded nw <- network(mat) # create network object networks[[i]] <- nw # add network to the list } fit <- btergm(networks ~ edges + istar(2) + mutual, R = 200) # extract coefficients and create null hypothesis vector null <- coef(fit) # estimated coefs null[3] <- 0 # set mutual term = 0 # sample 20 dyads per time step and compute probability ratios probabilities <- matrix(nrow = 9, ncol = length(networks)) # nrow = 9 because three probabilities + upper and lower CIs colnames(probabilities) <- paste("t =", 1:length(networks)) for (t in 1:length(networks)) { d <- dim(as.matrix(networks[[t]])) # how many row and column nodes? size <- d[1] * d[2] # size of the matrix nw <- matrix(1:size, nrow = d[1], ncol = d[2]) nw <- nw[lower.tri(nw)] # sample only from lower triangle b/c samp <- sample(nw, 20) # dyadic probabilities are symmetric prob.est.00 <- numeric(0) prob.est.01 <- numeric(0) prob.est.11 <- numeric(0) prob.null.00 <- numeric(0) prob.null.01 <- numeric(0) prob.null.11 <- numeric(0) for (k in 1:20) { i <- arrayInd(samp[k], d)[1, 1] # recover 'i's and 'j's from sample j <- arrayInd(samp[k], d)[1, 2] # run interpretation function with estimated coefs and mutual = 0: int.est <- interpret(fit, type = "dyad", i = i, j = j, t = t) int.null <- interpret(fit, coefficients = null, type = "dyad", i = i, j = j, t = t) prob.est.00 <- c(prob.est.00, int.est[[1]][1, 1]) prob.est.11 <- c(prob.est.11, int.est[[1]][2, 2]) mean.est.01 <- (int.est[[1]][1, 2] + int.est[[1]][2, 1]) / 2 prob.est.01 <- c(prob.est.01, mean.est.01) prob.null.00 <- c(prob.null.00, int.null[[1]][1, 1]) prob.null.11 <- c(prob.null.11, int.null[[1]][2, 2]) mean.null.01 <- (int.null[[1]][1, 2] + int.null[[1]][2, 1]) / 2 prob.null.01 <- c(prob.null.01, mean.null.01) } prob.ratio.00 <- prob.est.00 / prob.null.00 # ratio of est. and null hyp prob.ratio.01 <- prob.est.01 / prob.null.01 prob.ratio.11 <- prob.est.11 / prob.null.11 probabilities[1, t] <- mean(prob.ratio.00) # mean estimated 00 tie prob probabilities[2, t] <- mean(prob.ratio.01) # mean estimated 01 tie prob probabilities[3, t] <- mean(prob.ratio.11) # mean estimated 11 tie prob ci.00 <- t.test(prob.ratio.00, conf.level = 0.99)$conf.int ci.01 <- t.test(prob.ratio.01, conf.level = 0.99)$conf.int ci.11 <- t.test(prob.ratio.11, conf.level = 0.99)$conf.int probabilities[4, t] <- ci.00[1] # lower 00 conf. interval probabilities[5, t] <- ci.01[1] # lower 01 conf. interval probabilities[6, t] <- ci.11[1] # lower 11 conf. interval probabilities[7, t] <- ci.00[2] # upper 00 conf. interval probabilities[8, t] <- ci.01[2] # upper 01 conf. interval probabilities[9, t] <- ci.11[2] # upper 11 conf. interval } # create barplots from probability ratios and CIs require("gplots") bp <- barplot2(probabilities[1:3, ], beside = TRUE, plot.ci = TRUE, ci.l = probabilities[4:6, ], ci.u = probabilities[7:9, ], col = c("tan", "tan2", "tan3"), ci.col = "grey40", xlab = "Dyadic tie values", ylab = "Estimated Prob./Null Prob.") mtext(1, at = bp, text = c("(0,0)", "(0,1)", "(1,1)"), line = 0, cex = 0.5) ##### The following examples illustrate the behavior of ##### ##### the interpret function with undirected and/or ##### ##### bipartite graphs with or without structural zeros. ##### library("statnet") library("btergm") # micro-level interpretation for undirected network with structural zeros set.seed(12345) mat <- matrix(rbinom(400, 1, 0.1), nrow = 20, ncol = 20) mat[1, 5] <- 1 mat[10, 7] <- 1 mat[15, 3] <- 1 mat[18, 4] < 1 nw <- network(mat, directed = FALSE, bipartite = FALSE) cv <- matrix(rnorm(400), nrow = 20, ncol = 20) offsetmat <- matrix(rbinom(400, 1, 0.1), nrow = 20, ncol = 20) offsetmat[1, 5] <- 1 offsetmat[10, 7] <- 1 offsetmat[15, 3] <- 1 offsetmat[18, 4] < 1 model <- ergm(nw ~ edges + kstar(2) + edgecov(cv) + offset(edgecov(offsetmat)), offset.coef = -Inf) summary(model) # tie-level interpretation (note that dyad interpretation would not make any # sense in an undirected network): interpret(model, type = "tie", i = 1, j = 2) # 0.28 (= normal dyad) interpret(model, type = "tie", i = 1, j = 5) # 0.00 (= structural zero) # node-level interpretation; note the many 0 probabilities due to the # structural zeros; also note the warning message that the probabilities may # be slightly imprecise because -Inf needs to be approximated by some large # negative number (-9e8): interpret(model, type = "node", i = 1, j = 3:5) # repeat the same exercise for a directed network nw <- network(mat, directed = TRUE, bipartite = FALSE) model <- ergm(nw ~ edges + istar(2) + edgecov(cv) + offset(edgecov(offsetmat)), offset.coef = -Inf) interpret(model, type = "tie", i = 1, j = 2) # 0.13 (= normal dyad) interpret(model, type = "tie", i = 1, j = 5) # 0.00 (= structural zero) interpret(model, type = "dyad", i = 1, j = 2) # results for normal dyad interpret(model, type = "dyad", i = 1, j = 5) # results for i->j struct. zero interpret(model, type = "node", i = 1, j = 3:5) # micro-level interpretation for bipartite graph with structural zeros set.seed(12345) mat <- matrix(rbinom(200, 1, 0.1), nrow = 20, ncol = 10) mat[1, 5] <- 1 mat[10, 7] <- 1 mat[15, 3] <- 1 mat[18, 4] < 1 nw <- network(mat, directed = FALSE, bipartite = TRUE) cv <- matrix(rnorm(200), nrow = 20, ncol = 10) # some covariate offsetmat <- matrix(rbinom(200, 1, 0.1), nrow = 20, ncol = 10) offsetmat[1, 5] <- 1 offsetmat[10, 7] <- 1 offsetmat[15, 3] <- 1 offsetmat[18, 4] < 1 model <- ergm(nw ~ edges + b1star(2) + edgecov(cv) + offset(edgecov(offsetmat)), offset.coef = -Inf) summary(model) # tie-level interpretation; note the index for the second mode starts with 21 interpret(model, type = "tie", i = 1, j = 21) # dyad-level interpretation does not make sense because network is undirected; # node-level interpretation prints warning due to structural zeros, but # computes the correct probabilities (though slightly imprecise because -Inf # is approximated by some small number: interpret(model, type = "node", i = 1, j = 21:25) # compute all dyadic probabilities dyads <- edgeprob(model) dyads ## End(Not run)
Longitudinal classroom friendship network and behavior (Andrea Knecht).
Note: the data have to be transformed before they can be used with btergm and related packages (see examples below).
friendship
is a list of adjacency matrices at four time
points, containing friendship nominations of the column node by the row
node. The following values are used: 0
= no, 1
= yes,
NA
= missing, 10
= not a member of the classroom (structural
zero).
demographics
is a data frame with 26 rows (the pupils) and four demographic variables about the pupils:
sex
(1
= girl, 2
= boy)
age
(in years)
ethnicity
(1
= Dutch, 2
= other, 0
=
missing)
religion
(1
= Christian, 2
=
non-religious, 3
= non-Christian religion, 0
= missing)
primary
is a 26 x 26 matrix indicating whether two pupils
attended the same primary school. 0
= no, 1
= yes.
delinquency
is a data frame with 26 rows (the pupils) and
four columns (the four time steps). It contains the rounded average of four
items (stealing, vandalizing, fighting, graffiti). Categories: frequency
over last three months, 1
= never, 2
= once, 3
= 2–4
times, 4
= 5–10 times, 5
= more than 10 times; 0
=
missing.
alcohol
is a data frame with 26 rows (the pupils) and 3
columns (waves 2, 3, and 4). It contains data on alcohol use (“How
often did you drink alcohol with friends in the last three months?”).
Categories: 1
= never, 2
= once, 3
= 2–4 times,
4
= 5–10 times, 5
= more than 10 times; 0
= missing.
advice
is a data frame with one variable, “school
advice”, the assessment given at the end of primary school about the school
capabilities of the pupil (4
= low, 8
= high, 0
=
missing)
The Knecht dataset contains the friendship network of 26 pupils in a Dutch school class measured at four time points along with several demographic and behavioral covariates like age, sex, ethnicity, religion, delinquency, alcohol consumption, primary school co-attendance, and school advice. Some of these covariates are constant while others vary over time.
The full dataset (see Knecht 2006 and 2008) contains a large number of classrooms while the dataset presented here is an excerpt based on one single classroom. This excerpt was first used in a tutorial for the software Siena and the corresponding R package RSiena (Snijders, Steglich and van de Bunt 2010). The following description was largely copied from the original data description provided on the homepage of the Siena project (see below for the URL).
The data were collected between September 2003 and June 2004 by Andrea Knecht, supervised by Chris Baerveldt, at the Department of Sociology of the University of Utrecht (NL). The entire study is reported in Knecht (2008). The project was funded by the Netherlands Organisation for Scientific Research NWO, grant 401-01-554. The 26 students were followed over their first year at secondary school during which friendship networks as well as other data were assessed at four time points at intervals of three months. There were 17 girls and 9 boys in the class, aged 11–13 at the beginning of the school year. Network data were assessed by asking students to indicate up to twelve classmates which they considered good friends. Delinquency is defined as a rounded average over four types of minor delinquency (stealing, vandalism, graffiti, and fighting), measured in each of the four waves of data collection. The five-point scale ranged from ‘never’ to 'more than 10 times', and the distribution is highly skewed. In a range of 1–5, the mode was 1 at all four waves, the average rose over time from 1.4 to 2.0, and the value 5 was never observed.
The data were gathered by Andrea Knecht, as part of her PhD research, building on methods developed by Chris Baerveldt, initiator and supervisor of the project. The project is funded by the Netherlands Organisation for Scientific Research NWO, grant 401-01-554, and is part of the research program "Dynamics of Networks and Behavior" with principle investigator Tom A. B. Snijders.
Complete original data: https://easy.dans.knaw.nl/ui/datasets/id/easy-dataset:48665
This excerpt in Siena format: http://www.stats.ox.ac.uk/~snijders/siena/klas12b.zip
Siena dataset description: http://www.stats.ox.ac.uk/~snijders/siena/tutorial2010_data.htm
Permission to redistribute this dataset along with this package was granted by Andrea Knecht on April 17, 2014. Questions about the data or the original study should be directed to her.
Knecht, Andrea (2006): Networks and Actor Attributes in Early Adolescence [2003/04]. Utrecht, The Netherlands Research School ICS, Department of Sociology, Utrecht University. (ICS-Codebook no. 61).
Knecht, Andrea (2008): Friendship Selection and Friends' Influence. Dynamics of Networks and Actor Attributes in Early Adolescence. PhD Dissertation, University of Utrecht. https://dspace.library.uu.nl/handle/1874/25950.
Knecht, Andrea, Tom A. B. Snijders, Chris Baerveldt, Christian E. G. Steglich, and Werner Raub (2010): Friendship and Delinquency: Selection and Influence Processes in Early Adolescence. Social Development 19(3): 494–514. doi:10.1111/j.1467-9507.2009.00564.x.
Leifeld, Philip and Skyler J. Cranmer (2019): A Theoretical and Empirical Comparison of the Temporal Exponential Random Graph Model and the Stochastic Actor-Oriented Model. Network Science 7(1): 20–51. doi:10.1017/nws.2018.26.
Leifeld, Philip, Skyler J. Cranmer and Bruce A. Desmarais (2018): Temporal Exponential Random Graph Models with btergm: Estimation and Bootstrap Confidence Intervals. Journal of Statistical Software 83(6): 1–36. doi:10.18637/jss.v083.i06.
Snijders, Tom A. B., Christian E. G. Steglich, and Gerhard G. van de Bunt (2010): Introduction to Actor-Based Models for Network Dynamics. Social Networks 32: 44–60. doi:10.1016/j.socnet.2009.02.004.
Steglich, Christian E. G. and Andrea Knecht (2009): Die statistische Analyse dynamischer Netzwerkdaten. In: Stegbauer, Christian and Roger Haeussling (editors), Handbuch der Netzwerkforschung, Wiesbaden: Verlag fuer Sozialwissenschaften.
## Not run: # ==================================================================== # The following example was taken from the JSS article about btergm # that is referenced above (Leifeld, Cranmer and Desmarais 2018). # ==================================================================== require("texreg") require("sna") require("btergm") require("RSiena") data("knecht") # step 1: make sure the network matrices have node labels for (i in 1:length(friendship)) { rownames(friendship[[i]]) <- 1:nrow(friendship[[i]]) colnames(friendship[[i]]) <- 1:ncol(friendship[[i]]) } rownames(primary) <- rownames(friendship[[1]]) colnames(primary) <- colnames(friendship[[1]]) sex <- demographics$sex names(sex) <- 1:length(sex) # step 2: imputation of NAs and removal of absent nodes: friendship <- handleMissings(friendship, na = 10, method = "remove") friendship <- handleMissings(friendship, na = NA, method = "fillmode") # step 3: add nodal covariates to the networks for (i in 1:length(friendship)) { s <- adjust(sex, friendship[[i]]) friendship[[i]] <- network(friendship[[i]]) friendship[[i]] <- set.vertex.attribute(friendship[[i]], "sex", s) idegsqrt <- sqrt(degree(friendship[[i]], cmode = "indegree")) friendship[[i]] <- set.vertex.attribute(friendship[[i]], "idegsqrt", idegsqrt) odegsqrt <- sqrt(degree(friendship[[i]], cmode = "outdegree")) friendship[[i]] <- set.vertex.attribute(friendship[[i]], "odegsqrt", odegsqrt) } sapply(friendship, network.size) # step 4: plot the networks pdf("knecht.pdf") par(mfrow = c(2, 2), mar = c(0, 0, 1, 0)) for (i in 1:length(friendship)) { plot(network(friendship[[i]]), main = paste("t =", i), usearrows = TRUE, edge.col = "grey50") } dev.off() # step 5: estimate TERGMS without and with temporal dependencies model.2a <- btergm(friendship ~ edges + mutual + ttriple + transitiveties + ctriple + nodeicov("idegsqrt") + nodeicov("odegsqrt") + nodeocov("odegsqrt") + nodeofactor("sex") + nodeifactor("sex") + nodematch("sex") + edgecov(primary), R = 100) model.2b <- btergm(friendship ~ edges + mutual + ttriple + transitiveties + ctriple + nodeicov("idegsqrt") + nodeicov("odegsqrt") + nodeocov("odegsqrt") + nodeofactor("sex") + nodeifactor("sex") + nodematch("sex") + edgecov(primary) + delrecip + memory(type = "stability"), R = 100) # step 6: alternatively, estimate via MCMC-MLE: model.2d <- mtergm(friendship ~ edges + mutual + ttriple + transitiveties + ctriple + nodeicov("idegsqrt") + nodeicov("odegsqrt") + nodeocov("odegsqrt") + nodeofactor("sex") + nodeifactor("sex") + nodematch("sex") + edgecov(primary) + delrecip + memory(type = "stability"), control = control.ergm(MCMC.samplesize = 5000, MCMC.interval = 2000)) # step 7: GOF assessment with out-of-sample prediction # (note the commentaries and corrections at # https://doi.org/10.1017/nws.2022.7 and # https://doi.org/10.1017/nws.2022.6) model.2e <- btergm(friendship[1:3] ~ edges + mutual + ttriple + transitiveties + ctriple + nodeicov("idegsqrt") + nodeicov("odegsqrt") + nodeocov("odegsqrt") + nodeofactor("sex") + nodeifactor("sex") + nodematch("sex") + edgecov(primary) + delrecip + memory(type = "stability"), R = 100) gof.2e <- gof(model.2e, nsim = 100, target = friendship[[4]], formula = friendship[3:4] ~ edges + mutual + ttriple + transitiveties + ctriple + nodeicov("idegsqrt") + nodeicov("odegsqrt") + nodeocov("odegsqrt") + nodeofactor("sex") + nodeifactor("sex") + nodematch("sex") + edgecov(primary) + delrecip + memory(type = "stability"), coef = coef(model.2b), statistics = c(esp, dsp, geodesic, deg, triad.undirected, rocpr)) pdf("gof-2e.pdf", width = 8, height = 6) plot(gof.2e) dev.off() ## End(Not run)
## Not run: # ==================================================================== # The following example was taken from the JSS article about btergm # that is referenced above (Leifeld, Cranmer and Desmarais 2018). # ==================================================================== require("texreg") require("sna") require("btergm") require("RSiena") data("knecht") # step 1: make sure the network matrices have node labels for (i in 1:length(friendship)) { rownames(friendship[[i]]) <- 1:nrow(friendship[[i]]) colnames(friendship[[i]]) <- 1:ncol(friendship[[i]]) } rownames(primary) <- rownames(friendship[[1]]) colnames(primary) <- colnames(friendship[[1]]) sex <- demographics$sex names(sex) <- 1:length(sex) # step 2: imputation of NAs and removal of absent nodes: friendship <- handleMissings(friendship, na = 10, method = "remove") friendship <- handleMissings(friendship, na = NA, method = "fillmode") # step 3: add nodal covariates to the networks for (i in 1:length(friendship)) { s <- adjust(sex, friendship[[i]]) friendship[[i]] <- network(friendship[[i]]) friendship[[i]] <- set.vertex.attribute(friendship[[i]], "sex", s) idegsqrt <- sqrt(degree(friendship[[i]], cmode = "indegree")) friendship[[i]] <- set.vertex.attribute(friendship[[i]], "idegsqrt", idegsqrt) odegsqrt <- sqrt(degree(friendship[[i]], cmode = "outdegree")) friendship[[i]] <- set.vertex.attribute(friendship[[i]], "odegsqrt", odegsqrt) } sapply(friendship, network.size) # step 4: plot the networks pdf("knecht.pdf") par(mfrow = c(2, 2), mar = c(0, 0, 1, 0)) for (i in 1:length(friendship)) { plot(network(friendship[[i]]), main = paste("t =", i), usearrows = TRUE, edge.col = "grey50") } dev.off() # step 5: estimate TERGMS without and with temporal dependencies model.2a <- btergm(friendship ~ edges + mutual + ttriple + transitiveties + ctriple + nodeicov("idegsqrt") + nodeicov("odegsqrt") + nodeocov("odegsqrt") + nodeofactor("sex") + nodeifactor("sex") + nodematch("sex") + edgecov(primary), R = 100) model.2b <- btergm(friendship ~ edges + mutual + ttriple + transitiveties + ctriple + nodeicov("idegsqrt") + nodeicov("odegsqrt") + nodeocov("odegsqrt") + nodeofactor("sex") + nodeifactor("sex") + nodematch("sex") + edgecov(primary) + delrecip + memory(type = "stability"), R = 100) # step 6: alternatively, estimate via MCMC-MLE: model.2d <- mtergm(friendship ~ edges + mutual + ttriple + transitiveties + ctriple + nodeicov("idegsqrt") + nodeicov("odegsqrt") + nodeocov("odegsqrt") + nodeofactor("sex") + nodeifactor("sex") + nodematch("sex") + edgecov(primary) + delrecip + memory(type = "stability"), control = control.ergm(MCMC.samplesize = 5000, MCMC.interval = 2000)) # step 7: GOF assessment with out-of-sample prediction # (note the commentaries and corrections at # https://doi.org/10.1017/nws.2022.7 and # https://doi.org/10.1017/nws.2022.6) model.2e <- btergm(friendship[1:3] ~ edges + mutual + ttriple + transitiveties + ctriple + nodeicov("idegsqrt") + nodeicov("odegsqrt") + nodeocov("odegsqrt") + nodeofactor("sex") + nodeifactor("sex") + nodematch("sex") + edgecov(primary) + delrecip + memory(type = "stability"), R = 100) gof.2e <- gof(model.2e, nsim = 100, target = friendship[[4]], formula = friendship[3:4] ~ edges + mutual + ttriple + transitiveties + ctriple + nodeicov("idegsqrt") + nodeicov("odegsqrt") + nodeocov("odegsqrt") + nodeofactor("sex") + nodeifactor("sex") + nodematch("sex") + edgecov(primary) + delrecip + memory(type = "stability"), coef = coef(model.2b), statistics = c(esp, dsp, geodesic, deg, triad.undirected, rocpr)) pdf("gof-2e.pdf", width = 8, height = 6) plot(gof.2e) dev.off() ## End(Not run)
Plot marginal effects for two-way interactions in (T)ERGMs.
marginalplot( model, var1, var2, inter, ci = 0.95, rug = FALSE, point = FALSE, structzeromat = NULL, zeroline = TRUE, color = "black", xlab = NULL, ylab = NULL )
marginalplot( model, var1, var2, inter, ci = 0.95, rug = FALSE, point = FALSE, structzeromat = NULL, zeroline = TRUE, color = "black", xlab = NULL, ylab = NULL )
model |
An |
var1 |
Name of the first main variable. This is the focal variable. |
var2 |
Name of the second main variable. This is the conditioning variable. |
inter |
Name of the interaction effect. |
ci |
Significance level. |
rug |
Display the distribution of the conditioning variable at the bottom of the plot? |
point |
Display error bars for the levels of the conditioning variable (instead of a continuous curve)? |
structzeromat |
An optional matrix object which indicates dyads that should be deleted prior to the calculation of the confidence interval for the marginal effect curve. This is useful when such a matrix was used to indicate structural zeros during estimation. In this event, the dyads characterized by structural zeros are not allowed to be tied, therefore they should be removed from the set of dyads used for the calculation of marginal effects. The matrix should contain ones for structural zeros and zeros for entries that should be used. |
zeroline |
Draw a horizontal line to indicate zero for the first main variable? |
color |
Color of the curve, confidence interval, and distribution. |
xlab |
Axis label for the second (conditioning) variable. |
ylab |
Axis label for the first (focal) variable. |
The marginalplot
function creates marginal effects plots for ERGMs
with interaction effects. The user has to supply the ergm
object and
the coefficient names of the first main variable, the second main variable,
and the interaction term as stored in the coefficients vector inside the
ergm
object. It is possible to draw continuous curves or discrete
error bars depending on the nature of the data (using the point
argument). The distribution of the second (conditioning) variable can be
plotted at the bottom of the viewport using the rug
argument.
The resulting marginal effects plot is a ggplot2
plot. This means it
can be extended by plotting additional elements and using themes.
Other interpretation:
edgeprob()
,
interpret()
## Not run: # data preparation data("florentine") n <- network.size(flobusiness) wealth <- get.vertex.attribute(flobusiness, "wealth") priorates <- get.vertex.attribute(flobusiness, "priorates") wealth.icov <- matrix(rep(wealth, n), ncol = n, byrow = TRUE) priorates.icov <- matrix(rep(priorates, n), ncol = n, byrow = TRUE) interac <- wealth.icov * priorates.icov # estimate model with interaction effect model <- ergm(flobusiness ~ edges + esp(1) + edgecov(wealth.icov) + edgecov(priorates.icov) + edgecov(interac)) # plot the interaction (note the additional optional ggplot2 elements) marginalplot(model, var1 = "edgecov.wealth.icov", var2 = "edgecov.priorates.icov", inter = "edgecov.interac", color = "darkred", rug = TRUE, point = FALSE, xlab = "Priorates", ylab = "Wealth") + ggplot2::theme_bw() + ggplot2::ggtitle("Interaction effect") ## End(Not run)
## Not run: # data preparation data("florentine") n <- network.size(flobusiness) wealth <- get.vertex.attribute(flobusiness, "wealth") priorates <- get.vertex.attribute(flobusiness, "priorates") wealth.icov <- matrix(rep(wealth, n), ncol = n, byrow = TRUE) priorates.icov <- matrix(rep(priorates, n), ncol = n, byrow = TRUE) interac <- wealth.icov * priorates.icov # estimate model with interaction effect model <- ergm(flobusiness ~ edges + esp(1) + edgecov(wealth.icov) + edgecov(priorates.icov) + edgecov(interac)) # plot the interaction (note the additional optional ggplot2 elements) marginalplot(model, var1 = "edgecov.wealth.icov", var2 = "edgecov.priorates.icov", inter = "edgecov.interac", color = "darkred", rug = TRUE, point = FALSE, xlab = "Priorates", ylab = "Wealth") + ggplot2::theme_bw() + ggplot2::ggtitle("Interaction effect") ## End(Not run)
Estimate a TERGM by Markov Chain Monte Carlo Maximum Likelihood Estimation
mtergm(formula, constraints = ~., returndata = FALSE, verbose = TRUE, ...)
mtergm(formula, constraints = ~., returndata = FALSE, verbose = TRUE, ...)
formula |
Formula for the TERGM. Model construction works like in the
ergm package with the same model terms etc. (for a list of terms, see
|
constraints |
Constraints of the ERGM. See |
returndata |
Return the processed input data instead of estimating and
returning the model? In the |
verbose |
Print details about data preprocessing and estimation settings. |
... |
Further arguments to be handed over to the
|
The mtergm
function computes TERGMs by MCMC MLE (or MPLE with
uncorrected standard errors) via blockdiagonal matrices and structural zeros.
It acts as a wrapper for the ergm package. The btergm
function
is faster than the mtergm
function but is only asymptotically unbiased
the longer the time series. The mtergm
function yields unbiased
estimates and standard errors but may suffer from degeneracy if the model is
not specified in good keeping with the true data-generating process.
Philip Leifeld, Skyler J. Cranmer, Bruce A. Desmarais
Leifeld, Philip, Skyler J. Cranmer and Bruce A. Desmarais (2017): Temporal Exponential Random Graph Models with btergm: Estimation and Bootstrap Confidence Intervals. Journal of Statistical Software 83(6): 1-36. doi:10.18637/jss.v083.i06.
library("network") set.seed(5) networks <- list() for (i in 1:10) { # create 10 random networks with 10 actors mat <- matrix(rbinom(100, 1, .25), nrow = 10, ncol = 10) diag(mat) <- 0 # loops are excluded nw <- network::network(mat) # create network object networks[[i]] <- nw # add network to the list } covariates <- list() for (i in 1:10) { # create 10 matrices as covariate mat <- matrix(rnorm(100), nrow = 10, ncol = 10) covariates[[i]] <- mat # add matrix to the list } ## Not run: fit2 <- mtergm(networks ~ edges + istar(2) + edgecov(covariates)) summary(fit2) ## End(Not run) # For examples with real data, see help("knecht") or help("alliances").
library("network") set.seed(5) networks <- list() for (i in 1:10) { # create 10 random networks with 10 actors mat <- matrix(rbinom(100, 1, .25), nrow = 10, ncol = 10) diag(mat) <- 0 # loops are excluded nw <- network::network(mat) # create network object networks[[i]] <- nw # add network to the list } covariates <- list() for (i in 1:10) { # create 10 matrices as covariate mat <- matrix(rnorm(100), nrow = 10, ncol = 10) covariates[[i]] <- mat # add matrix to the list } ## Not run: fit2 <- mtergm(networks ~ edges + istar(2) + edgecov(covariates)) summary(fit2) ## End(Not run) # For examples with real data, see help("knecht") or help("alliances").
An S4 class to represent a fitted TERGM by MCMC-MLE.
## S4 method for signature 'mtergm' show(object) ## S4 method for signature 'mtergm' coef(object, invlogit = FALSE, ...) ## S4 method for signature 'mtergm' nobs(object) timesteps.mtergm(object) ## S4 method for signature 'mtergm' summary(object, ...)
## S4 method for signature 'mtergm' show(object) ## S4 method for signature 'mtergm' coef(object, invlogit = FALSE, ...) ## S4 method for signature 'mtergm' nobs(object) timesteps.mtergm(object) ## S4 method for signature 'mtergm' summary(object, ...)
object |
An |
invlogit |
Apply inverse logit transformation to the estimates and/or
confidence intervals? That is, |
... |
Currently not in use. |
mtergm
objects result from MCMC-MLE-based estimation of a TERGM via
the mtergm
function. They contain the coefficients, standard
errors, and p-values, among other details.
show(mtergm)
: Show the coefficients of an mtergm
object.
coef(mtergm)
: Return the coefficients of an mtergm
object.
nobs(mtergm)
: Return the coefficients of an mtergm
object.
timesteps.mtergm()
: Return the number of time steps saved in an
mtergm
object.
summary(mtergm)
: Return the coefficients of an mtergm
object.
coef
Object of class "numeric"
. The coefficients.
se
Object of class "numeric"
. The standard errors.
pval
Object of class "numeric"
. The p-values.
nobs
Object of class "numeric"
. Number of observations.
time.steps
Object of class "numeric"
. Number of time steps.
formula
Object of class "formula"
. The original model formula
(without indices for the time steps).
formula2
The revised formula with the object references after applying
the tergmprepare
function.
auto.adjust
Object of class "logical"
. Indicates whether
automatic adjustment of dimensions was done before estimation.
offset
Object of class "logical"
. Indicates whether an offset
matrix with structural zeros was used.
directed
Object of class "logical"
. Are the dependent networks
directed?
bipartite
Object of class "logical"
. Are the dependent networks
bipartite?
estimate
Estimate: either MLE or MPLE.
loglik
Log likelihood of the MLE.
aic
Akaike's Information Criterion.
bic
Bayesian Information Criterion.
ergm
The original ergm
object as estimated by the
ergm
function in the ergm package.
nvertices
Number of vertices.
data
The data after processing by the tergmprepare
function.
Philip Leifeld
Other tergm-classes:
btergm-class
,
createBtergm()
,
createMtergm()
,
createTbergm()
,
tbergm-class
btergm
ObjectSimulate networks from a btergm
object using MCMC sampler.
Simulate networks from an mtergm
object using MCMC sampler.
## S3 method for class 'btergm' simulate( object, nsim = 1, seed = NULL, index = NULL, formula = getformula(object), coef = object@coef, verbose = TRUE, ... ) ## S3 method for class 'mtergm' simulate( object, nsim = 1, seed = NULL, index = NULL, formula = getformula(object), coef = object@coef, verbose = TRUE, ... )
## S3 method for class 'btergm' simulate( object, nsim = 1, seed = NULL, index = NULL, formula = getformula(object), coef = object@coef, verbose = TRUE, ... ) ## S3 method for class 'mtergm' simulate( object, nsim = 1, seed = NULL, index = NULL, formula = getformula(object), coef = object@coef, verbose = TRUE, ... )
object |
A |
nsim |
The number of networks to be simulated. Note that for values
greater than one, a |
seed |
Random number integer seed. See set.seed. |
index |
Index of the network from which the new network(s) should be simulated. The index refers to the list of response networks on the left-hand side of the model formula. Note that more recent networks are located at the end of the list. By default, the last (= most recent) network is used. |
formula |
A model formula from which the new network(s) should be
simulated. By default, the formula is taken from the |
coef |
A vector of parameter estimates. By default, the coefficients are
extracted from the given |
verbose |
Print additional details while running the simulations? |
... |
Further arguments are handed over to the
|
The simulate.btergm
function is a wrapper for the
simulate_formula
function in the ergm package (see
help("simulate.ergm")
). It can be used to simulate new networks from a
btergm
object. The index
argument specifies from which of the
original networks the new network(s) should be simulated. For example, if
object
is an estimation based on cosponsorship networks from the 99th
to the 107th Congress (as in Desmarais and Cranmer 2012), and the
cosponsorship network in the 108th Congress should be predicted using the
simulate.btergm
function, then the argument index = 9
should be
passed to the function because the network should be based on the 9th network
in the list (that is, the latest network, which is the cosponsorship network
for the 107th Congress). Note that all relevant objects (the networks and the
covariates) must be present in the workspace (as was the case during the
estimation of the model).
Desmarais, Bruce A. and Skyler J. Cranmer (2012): Statistical Mechanics of Networks: Estimation and Uncertainty. Physica A 391: 1865–1876. doi:10.1016/j.physa.2011.10.018.
Leifeld, Philip, Skyler J. Cranmer and Bruce A. Desmarais (2018): Temporal Exponential Random Graph Models with btergm: Estimation and Bootstrap Confidence Intervals. Journal of Statistical Software 83(6): 1–36. doi:10.18637/jss.v083.i06.
## Not run: # fit a TERGM to some toy data library("network") set.seed(5) networks <- list() for(i in 1:10){ # create 10 random networks with 10 actors mat <- matrix(rbinom(100, 1, .25), nrow = 10, ncol = 10) diag(mat) <- 0 # loops are excluded nw <- network(mat) # create network object networks[[i]] <- nw # add network to the list } covariates <- list() for (i in 1:10) { # create 10 matrices as covariate mat <- matrix(rnorm(100), nrow = 10, ncol = 10) covariates[[i]] <- mat # add matrix to the list } fit <- btergm(networks ~ edges + istar(2) + edgecov(covariates), R = 100) # simulate 12 new networks from the last (= 10th) time step sim1 <- simulate(fit, nsim = 12) # simulate 1 network from the first time step sim2 <- simulate(fit, index = 1) # simulate network from t = 5 with larger covariate coefficient coefs <- coef(fit) coefs["edgecov.covariates[[i]]"] <- 0.5 sim3 <- simulate(fit, index = 5, coef = coefs) ## End(Not run)
## Not run: # fit a TERGM to some toy data library("network") set.seed(5) networks <- list() for(i in 1:10){ # create 10 random networks with 10 actors mat <- matrix(rbinom(100, 1, .25), nrow = 10, ncol = 10) diag(mat) <- 0 # loops are excluded nw <- network(mat) # create network object networks[[i]] <- nw # add network to the list } covariates <- list() for (i in 1:10) { # create 10 matrices as covariate mat <- matrix(rnorm(100), nrow = 10, ncol = 10) covariates[[i]] <- mat # add matrix to the list } fit <- btergm(networks ~ edges + istar(2) + edgecov(covariates), R = 100) # simulate 12 new networks from the last (= 10th) time step sim1 <- simulate(fit, nsim = 12) # simulate 1 network from the first time step sim2 <- simulate(fit, index = 1) # simulate network from t = 5 with larger covariate coefficient coefs <- coef(fit) coefs["edgecov.covariates[[i]]"] <- 0.5 sim3 <- simulate(fit, index = 5, coef = coefs) ## End(Not run)
Estimate a TERGM using Bayesian estimation.
tbergm(formula, returndata = FALSE, verbose = TRUE, ...)
tbergm(formula, returndata = FALSE, verbose = TRUE, ...)
formula |
Formula for the TERGM. Model construction works like in the
ergm package with the same model terms etc. (for a list of terms, see
|
returndata |
Return the processed input data instead of estimating and
returning the model? In the |
verbose |
Print details about data preprocessing and estimation settings. |
... |
Further arguments to be handed over to the
|
The tbergm
function computes TERGMs by Bayesian estimation via
blockdiagonal matrices and structural zeros. It acts as a wrapper for the
bergm
function in the Bergm package.
Philip Leifeld
Caimo, Alberto and Nial Friel (2012): Bergm: Bayesian Exponential Random Graphs in R. Journal of Statistical Software 61(2): 1-25. doi:10.18637/jss.v061.i02.
An S4 class to represent a fitted TERGM using Bayesian estimation.
## S4 method for signature 'tbergm' show(object) ## S4 method for signature 'tbergm' nobs(object) timesteps.tbergm(object) ## S4 method for signature 'tbergm' summary(object, ...)
## S4 method for signature 'tbergm' show(object) ## S4 method for signature 'tbergm' nobs(object) timesteps.tbergm(object) ## S4 method for signature 'tbergm' summary(object, ...)
object |
A |
... |
Further arguments for the |
tbergm
objects result from Bayesian estimation of a TERGM using the
tbergm
function. They contain the original bergm
object
and some additional information.
show(tbergm)
: Show the coefficients of a tbergm
object.
nobs(tbergm)
: Return the number of observations saved in a
tbergm
object.
timesteps.tbergm()
: Return the number of time steps saved in a
tbergm
object.
summary(tbergm)
: Summary of a fitted tbergm
object.
time.steps
Object of class "numeric"
. Number of time steps.
formula
Object of class "formula"
. The original model formula
(without indices for the time steps).
formula2
The revised formula with the object references after applying
the tergmprepare
function.
auto.adjust
Object of class "logical"
. Indicates whether
automatic adjustment of dimensions was done before estimation.
offset
Object of class "logical"
. Indicates whether an offset
matrix with structural zeros was used.
directed
Object of class "logical"
. Are the dependent networks
directed?
bipartite
Object of class "logical"
. Are the dependent networks
bipartite?
estimate
Estimate: "bergm"
for Bayesian estimation.
bergm
The original bergm
object as estimated by the
bergm
function in the Bergm package.
nvertices
Number of vertices.
data
The data after processing by the tergmprepare
function.
Philip Leifeld
Other tergm-classes:
btergm-class
,
createBtergm()
,
createMtergm()
,
createTbergm()
,
mtergm-class
Network statistics that span multiple time points.
Transform a covariate using a function of time.
timecov( covariate, minimum = 1, maximum = length(covariate), transform = function(t) 1 + (0 * t) + (0 * t^2), onlytime = FALSE )
timecov( covariate, minimum = 1, maximum = length(covariate), transform = function(t) 1 + (0 * t) + (0 * t^2), onlytime = FALSE )
covariate |
The list of networks or matrices for which to create a time
covariate. This can be the list of networks on the left-hand side of the
formula, in which case a time trend is created as a covariate list of
matrices, or it can be a list of networks or matrices that is used as a
dyadic covariate on the right-hand side of the formula, in which case an
interaction effect between the time trend and the covariate is created. If
used as a model term inside a formula, |
minimum , maximum
|
For time steps below the |
transform |
In the default case, edges are modeled as being linearly
increasingly important over time (i.e., a linear time trend). By tweaking
the |
onlytime |
If |
In addition to the ERGM user terms that can be estimated within a single network (see ergm-terms), the btergm package provides additional model terms that can be used within a formula. These additional statistics span multiple time periods and are therefore called "temporal dependencies." Examples include memory terms (i.e., positive autoregression, dyadic stability, edge innovation, or edge loss), delayed reciprocity or mutuality, and time covariates (i.e., functions of time or interactions with time):
delrecip(mutuality = FALSE, lag = 1)
The delrecip
term
checks for delayed reciprocity. For example, if node j
is tied to
node i
at t = 1
, does this lead to a reciprocation of that
tie back from i
to j
at t = 2
? If
mutuality = TRUE
is set, this extends not only to ties, but also
non-ties. That is, if i
is not tied to j
at t = 1
,
will this lead to j
not being tied to i
at t = 2
, in
addition to positively reciprocal patterns over time? The lag
argument controls the size of the temporal lag: with lag = 1
,
reciprocity over one consecutive time period is checked. Note that as
lag
increases, the number of time steps on the dependent variable
decreases.
memory(type = "stability", lag = 1)
Memory terms control for
the impact of a previous network on the current network. Four different
types of memory terms are available: positive autoregression
(type = "autoregression"
) checks whether previous ties are carried
over to the current network; dyadic stability (type = "stability"
)
checks whether both edges and non-edges are stable between the previous
and the current network; edge loss (type = "loss"
) checks whether
ties in the previous network have been dissolved and no longer exist in
the current network; and edge innovation (type = "innovation"
)
checks whether previously unconnected nodes have the tendency to become
tied in the current network. The lag
argument accepts integer
values and controls whether the comparison is made with the previous
network (lag = 1
), the pre-previous network (lag = 2
) etc.
Note that as lag
increases, the number of time steps on the
dependent variable decreases.
timecov(x = NULL, minimum = 1, maximum = NULL,
transform = function(t) t)
The timecov
model term checks for
linear or non-linear time trends with regard to edge formation.
Optionally, this can be combined with a covariate to create an
interaction effect between a dyadic covariate and time in order to test
whether the importance of a covariate increases or decreases over time.
In the default case, edges modeled as being linearly increasingly
important over time. By tweaking the transform
function,
arbitrary functional forms of time can be tested. For example,
transform = sqrt
(for a geometrically decreasing time effect),
transform = function(x) x^2
(for a geometrically increasing time
effect), transform = function(t) t
(for a linear time trend) or
polynomial functional forms (e.g., 0 + (1 * t) + (1 * t^2)
) can
be used. For time steps below the minimum
value and above the
maximum
value, the time covariate is set to 0. These arguments
can be used to create step-wise, discrete effects, for example to use a
value of 0 up to an external event and 1 from that event onwards in
order to control for influences of external events.
The timecov
model term checks for linear or non-linear time trends
with regard to edge formation. Optionally, this can be combined with a
covariate to create an interaction effect between a dyadic covariate and time
in order to test whether the importance of a covariate increases or decreases
over time. The function can either be used in a formula with
btergm
, mtergm
, or tbergm
, or it
can be executed directly for manual inclusion of the results as a covariate.
timecov()
: Time trends and temporal covariate interactions
Leifeld, Philip, Skyler J. Cranmer and Bruce A. Desmarais (2017): Temporal Exponential Random Graph Models with btergm: Estimation and Bootstrap Confidence Intervals. Journal of Statistical Software 83(6): 1-36. doi:10.18637/jss.v083.i06.
Prepare data structure for TERGM estimation, including composition change.
tergmprepare(formula, offset = TRUE, blockdiag = FALSE, verbose = TRUE)
tergmprepare(formula, offset = TRUE, blockdiag = FALSE, verbose = TRUE)
formula |
The original formula provided by the user, given the data structures in the workspace. |
offset |
Indicates whether absent nodes should be added where they are
missing ( |
blockdiag |
Should the time steps be arranged in a blockdiagonal matrix
for use with MCMC-MLE or Bayesian estimation ( |
verbose |
Report details about dimension adjustment? |
This is a helper function that adjusts the dimensions of networks or
covariates within a given time step to each other by removing nodes that are
not present across all objects within a time step or by adding nodes where
they are missing (and simultaneously adding entries to a list of structural
zero matrices to indicate their absence). It is not necessary to have
identical (numbers of) nodes across time steps as long as the dimensions of
the matrices, networks, and vectors match cross-sectionally within time
steps, given that temporal dependency terms like memory are interpreted as
dyadic covariates in a given time step. This helper function also creates
these dyadic covariate data structures for some of the custom temporal model
terms, such as memory
and delrecip
. Leifeld, Cranmer and
Desmarais (2018) contain additional details on composition change, dimension
adjustment of matrices, and temporal dependencies. Note that this function
should not normally be used by the end user. It is automatically called
internally by the estimation functions to make the dimensions of all objects
conformable to each other for estimation. Use this function only for
diagnostic purposes!
A list with the following slots:
A character object containing the original name of the
object on the left-hand side of the formula provided by the user. This is
saved here because the formula is manipulated such that the left-hand
side of the formula contains a new item networks[[i]]
.
The list of networks on the left-hand side of the formula
after dimension adjustment, or a blockdiagonal network representing the
left-hand side of the formula after dimension adjustment if argument
blockdiag = TRUE
was used.
The maximum number of nodes of any time point after adjustment of dimensions.
Are the networks directed?
Are the networks bipartite?
The formula after manipulation and adjustment of the data,
including networks[[i]]
on the left-hand side and an added offset
covariate on the right-hand side of the formula, in addition to added
indices for the covariate terms.
The number of time steps of the dataset.
The right-hand side of the formula after adjustment, as
a vector of character
objects representing the terms.
A character
vector containing the names of the
objects in which the networks and covariates are stored, according to the
manipulated formula. This includes "networks"
(for the left-hand
side of the formula) and all objects containing exogenous covariates on
the right-hand side of the formula after manipulation.
Each of the covariates mentioned in the slot covnames
is
stored as an element of the list, either as a list of matrices or
networks (if blockdiag = FALSE
) or as a matrix or network object
(if blockdiag = TRUE
).
Did the function have to adjust the dimensions of the networks or covariates at all?
A matrix containing the number of nodes in the rows and columns of each object at each time step, after adjustment.
A list of offset covariate matrices or a large blockdiagonal
offset covariate matrix containing structural zeros. If
offset = FALSE
, this matrix or list of matrices will contain only
zeros. If offset = TRUE
, they will contain ones where nodes were
absent in the original data.
Philip Leifeld