Tutorial

Using TFitPy to evaluate the performance of a set of co-regulators for a given target.

Getting things ready#

The first step is to download the required datasets and generate the processed files. This is a one-time step that must be repeated each time a new version is installed. You need to specify the path of a folder where the data will be stored; this folder can be reused.

from tfitpy.datasets import install
data_path = "$HOME/datasets"
install(data_path)
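The path above contains a literal `$HOME`, which Python does not expand on its own, and this tutorial does not say whether `install` expands it. A minimal sketch of resolving the path explicitly before use, assuming only the standard library (`resolve_data_path` is a hypothetical helper, not part of TFitPy):

```python
import os
from pathlib import Path


def resolve_data_path(raw_path: str) -> Path:
    """Expand environment variables and a leading '~' in a folder path."""
    # Hypothetical helper for illustration; not a TFitPy function.
    expanded = os.path.expandvars(raw_path)
    return Path(expanded).expanduser()


# Print the resolved location so it can be checked before downloading data.
print(resolve_data_path("$HOME/datasets"))
```

The resolved path could then be passed as a string to both the install step and the cache-loading step so that they point at the same folder.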

We are now ready to evaluate our TFs!

Data#

In this tutorial we illustrate the use of the package on a potential set of co-regulators:

data1 = {
    "sources":["IKZF4","TBP","ZNF841"],
    "target":"CLK3"
}
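Pairwise indices, such as the GO similarity shown later in this tutorial, are reported once per unordered pair of sources. As a small sketch using only the standard library, the pairs implied by the three sources can be listed like this (the variable name `source_pairs` is illustrative):

```python
from itertools import combinations

data1 = {
    "sources": ["IKZF4", "TBP", "ZNF841"],
    "target": "CLK3",
}

# Every unordered pair of source TFs, in the order the sources are listed.
source_pairs = list(combinations(data1["sources"], 2))

for tf1, tf2 in source_pairs:
    print(tf1, tf2)
```

With n sources there are n(n-1)/2 such pairs, so three sources give three pairs.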

Load datasets cache#

# Expand "$HOME" in the folder path where data was stored during setup
import os
folder_path = os.path.expandvars(data_path)

# Load the datasets prepared during setup
import tfitpy as tt
cache = tt.load_cache(data_path=folder_path)

PPI based scores#

GO Functional Similarity#

import tfitpy.indices.go as g

s, df = g.goa_resnik_similarity(data1["sources"], datasets=cache)

1.3942496226771082

df

   tf1    tf2     score     n_terms_tf1  n_terms_tf2
0  IKZF4  TBP     1.686905  17           36
1  IKZF4  ZNF841  1.573010  17           7
2  TBP    ZNF841  0.922834  36           7
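The tutorial does not state how the overall score is derived from the table. In this output it appears to equal the arithmetic mean of the three pairwise scores. The check below uses the values copied from the output above; that the package always aggregates this way is an assumption, not something documented here.

```python
from statistics import mean

# Pairwise scores copied from the table above (rounded to six decimals).
pairwise_scores = {
    ("IKZF4", "TBP"): 1.686905,
    ("IKZF4", "ZNF841"): 1.573010,
    ("TBP", "ZNF841"): 0.922834,
}

# Overall score printed by the tutorial.
reported_overall = 1.3942496226771082

mean_score = mean(pairwise_scores.values())

# Allow for the rounding of the displayed pairwise scores.
matches = abs(mean_score - reported_overall) < 1e-6
print(mean_score, matches)
```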

gene2go = cache["go"]["gene2go"]
print(list(gene2go.keys())[:5])          # Keys should be gene symbols

for gene in data1["sources"] + [data1["target"]]:
    print(f"{gene}: {len(gene2go.get(gene, set()))} GO terms")

['NUDT4B', 'TRBV20OR9-2', 'IGKV3-7', 'IGKV1D-42', 'IGLV4-69']
IKZF4: 17 GO terms
TBP: 36 GO terms
ZNF841: 7 GO terms
CLK3: 21 GO terms
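The same kind of lookup can flag genes that have no GO annotations before any scores are computed. The sketch below assumes only what the output above suggests, namely a mapping from gene symbol to a collection of terms. The helper `genes_without_annotations` and the toy mapping are made up for illustration and are not TFitPy data.

```python
def genes_without_annotations(genes, gene2go):
    """Return the genes that are missing from gene2go or have no terms."""
    return [gene for gene in genes if not gene2go.get(gene)]


# Toy stand-in for cache["go"]["gene2go"]; names and terms are placeholders.
toy_gene2go = {
    "GENE_A": {"TERM_1", "TERM_2"},
    "GENE_B": set(),
}

missing = genes_without_annotations(["GENE_A", "GENE_B", "GENE_C"], toy_gene2go)
print(missing)
```

In the tutorial's own data all four genes have at least seven terms, so such a check would return an empty list there.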