Research: Aging Gene Communities
The experiment
I wanted to test whether aging has a central mechanism: a single cluster of the gene regulatory network in which the aging-related genes sit together and regulate one another.
The data
The gene regulatory network was downloaded from https://grand.networkmedicine.org/tissues/; I used the skeletal muscle dataset.
To convert between ENSG IDs and actual gene names, I used the Ensembl GTF files at https://ftp.ensembl.org/pub/release-74/gtf/homo_sapiens/.
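As an illustration of the ID conversion, here is a minimal Python sketch that builds an ENSG-to-name lookup from GTF lines. It is not taken from the repository; the function names `parse_gene_names` and `load_gene_names` are hypothetical, and it assumes the standard GTF layout of nine tab-separated columns, with `gene_id` and `gene_name` stored as `key "value";` pairs in the last column.

```python
import gzip
import re

# Matches key "value" pairs in the GTF attribute column (column 9).
ATTR_RE = re.compile(r'(\S+) "([^"]*)"')


def parse_gene_names(lines):
    """Map Ensembl gene IDs (ENSG...) to gene names from an iterable of GTF lines."""
    id_to_name = {}
    for line in lines:
        if line.startswith("#"):
            continue  # header / comment line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a complete GTF record
        attrs = dict(ATTR_RE.findall(fields[8]))
        gene_id = attrs.get("gene_id")
        gene_name = attrs.get("gene_name")
        if gene_id and gene_name:
            # Keep the first name seen for each gene ID.
            id_to_name.setdefault(gene_id, gene_name)
    return id_to_name


def load_gene_names(gtf_path):
    """Hypothetical helper: read a gzip-compressed GTF file from disk."""
    with gzip.open(gtf_path, "rt") as handle:
        return parse_gene_names(handle)
```

Calling `load_gene_names` on the downloaded `.gtf.gz` file would return a plain dictionary, so translating a network node is a single lookup such as `id_to_name.get(ensg_id, ensg_id)`, which falls back to the raw ID when no name is known.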
The code
The code can be found here: https://github.com/luojxxx/Aging-community
I took the gene regulatory network and ran the Louvain algorithm on it at increasing resolutions to split it into communities, which gave me everything from all genes in one community to each gene in its own community. To decide which resolution (and so which split of the genes into communities) is optimal, I score each partition as follows. Each community gets an average score: the sum of its genes' aging-relatedness scores (the metric from the previous research) divided by the number of genes in that community. I then take the standard deviation of those community averages for that particular split. The rationale is that a high standard deviation means some communities have very high scores and others very low ones, which is what separating the aging genes from the rest would look like. Finally, the standard deviation is divided by the number of communities to penalize partitions with too many communities.
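The scoring described above can be sketched in a few lines of Python. This is a rough illustration, not the code from the repository, and it makes several assumptions: the names `community_means`, `partition_score` and `sweep_resolutions` are hypothetical, the population standard deviation is used, genes without an aging score count as 0.0, and the Louvain step uses NetworkX's `louvain_communities` (the original code may use a different Louvain implementation).

```python
from statistics import fmean, pstdev


def community_means(communities, aging_score):
    """Average aging-relatedness score of each non-empty community."""
    return [
        # Assumption: genes missing from aging_score contribute 0.0.
        fmean([aging_score.get(gene, 0.0) for gene in genes])
        for genes in communities
        if genes
    ]


def partition_score(communities, aging_score):
    """Standard deviation of community averages, divided by the community count."""
    means = community_means(communities, aging_score)
    if not means:
        return 0.0
    # Assumption: population standard deviation; a sample one would also fit the text.
    return pstdev(means) / len(means)


def sweep_resolutions(graph, aging_score, resolutions, seed=0):
    """Score one Louvain partition per resolution value (requires networkx)."""
    from networkx.algorithms.community import louvain_communities

    results = []
    for resolution in resolutions:
        communities = louvain_communities(graph, resolution=resolution, seed=seed)
        score = partition_score(communities, aging_score)
        results.append((resolution, len(communities), score))
    return results


# Toy check: two communities, one holding all the high-scoring genes.
toy_scores = {"a": 1.0, "b": 1.0, "c": 0.0, "d": 0.0}
separated = [{"a", "b"}, {"c", "d"}]  # community averages 1.0 and 0.0
mixed = [{"a", "c"}, {"b", "d"}]      # community averages 0.5 and 0.5
print(partition_score(separated, toy_scores))  # 0.25
print(partition_score(mixed, toy_scores))      # 0.0
```

In the toy check, the partition that isolates the high-scoring genes gets the higher score, which is the behavior the metric is meant to reward. In NetworkX, resolution values above 1 favor smaller communities, so sweeping the resolution upward moves from a few large communities toward many small ones.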
The results
The scoring of the different ways to partition the genes is below:
As you can see, the highest-scoring partition is the 2nd one, with 22 communities. This is a very coarse-grained resolution, but it could still be useful: if all the aging-related genes are in one cluster, that still tells us something. So the next step is to look at the average score of each community, which is below (the number on the left is the community ID and the number on the right is the score):
As we can see, aging genes are somewhat more concentrated in some communities than in others, but no community has a concentration several times higher than the rest. At this point I put the research away, considering it a null result. Later, however, I realized that we still learned something: within this GRN and with this metric, aging-relatedness is consistent with a diffuse, multi-module architecture rather than a single central cluster.
Either way, I decided to share this research as an interesting attempt that still taught me something.