Tag Archives: Bioinformatics

Bioinformatics In Bengal

Dept. Biochemistry University of Dhaka

Visiting Bengal for the holidays, I didn’t expect to find a thriving bioinformatics community. Yet that’s exactly what I found when Dr. Haseena Khan invited me to visit her lab at the University of Dhaka. The Jute Genome Project was a consortium of academia, industry, and government which had sequenced & analyzed the genome of the jute plant.

What Dr. Khan and her researchers lacked in cutting-edge equipment, they made up for in passion, ingenuity & thorough knowledge of the most minuscule advancements in the field. After I had spent the day with them, Dr. Khan insisted I meet with the industrial wing of the project.

Tucked away amidst one of the most densely packed places on the planet are a few small buildings covered in plants; within them, incredible things are happening.

Lush green home of scientists, developers & supercomputers at DataSoft

DataSoft Systems Ltd. created a sub-division, Swapnojaatra (dream journey), which would “put scientists, developers, and supercomputers in one room and throw away the key,” as Palash, the Director of Technology for DataSoft, would tell me. Although the Jute Genome Project is now complete, the developers of Swapnojaatra are hooked on informatics. From the minute we met they were excited to show what they had done (within the limits of existing NDAs) and to ask what was new in the field in San Francisco. Indeed, the team here had studied how palindromes drive genomic re-assortment in the influenza A virus, performed molecular docking studies on Streptococcus pneumoniae and created many of their own informatics tools.

For a well-educated, computer-savvy, developing region, bioinformatics is a near-perfect industry. With low overhead costs compared to traditional wet-lab sciences, and endless data being generated in more economically developed countries, it’s only a matter of time before it takes hold. Bengal and bioinformatics may have been made for each other.

 

Citations:

A Putative Leucine-Rich Repeat Receptor-Like Kinase of Jute Involved in Stress Response (2010) by MS Islam, SB Nayeem, M Shoyaib, H Khan DOI: 10.1007/s11105-009-0166-4

Molecular-docking study of capsular regulatory protein in Streptococcus pneumoniae portends the novel approach to its treatment (2011) by S Thapa, A Zubaer DOI:10.2147/OAB.S26236

Palindromes drive the re-assortment in Influenza A (2011) by A Zubaer, S Thapa ISSN 0973-2063



Filed under Genomics, Microbiology

Drug Development from Binary to Gradient Model

Earlier this year a study by the Center for the Study of Drug Development at Tufts University placed the cost of developing a new drug at $1.3 billion [1].

Distribution of Development Funding

Though the number is contested by other researchers [2], it is well within the trend of previous studies and has now been widely accepted as an industry-wide average. Exacerbating the issue is the all-or-nothing nature of drug development, where failure during any phase of clinical trials can cause the termination of a project. It is therefore advantageous to consider technologies that will reduce the risk of this binary success/failure model and allow a transition to a gradient definition of therapeutic efficacy.

Trending Costs of Drug Development

Much of the high cost comes during phase 2 & 3 trials, where patient care, clinical production and regulatory leg-work consume funds at an alarming rate. Everything rides on the individual trial subjects, whose well-being is directly linked to success. Undesirable reactions to experimental treatments are unavoidable, and the margins for serious adverse events are kept tight by regulatory agencies to protect healthcare consumers. Often, however, ground-breaking treatments have to be shelved because they affect 10-15% of trial subjects detrimentally.

RD costs of new chemical entity (NCE)

This makes any ability to view trial subjects with increased resolution, and to discern subtle correlations between their reactions and consumer demographics, key in cutting the risk of total loss. Here I hope a story about my own experience is helpful, as I know it better than what anyone else has had to deal with. My time at Novartis began when I was brought on board to help with the development of a drug entering a repeat Phase IIB trial, as the first time around approximately 15% of subjects showed an adverse reaction of note.

Draft FDA Guidance on DNA Sequencing & Clinical Trials

Soon, however, folks began to get cold feet: do we put further resources behind this project, or cut our losses and move on to the next one? A third option now becoming available is to ask whether there was something specific to those 15% of patients that caused the unwanted reaction. Identifying this would allow the drug to move along its pipeline with contraindications that cover the failing demographics. Projects would no longer be limited to pass/fail, and development risk would be hedged.
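To make the "third option" concrete, here is a minimal sketch of splitting a trial cohort by some candidate factor and comparing adverse-reaction rates per group. Everything in it is an assumption for illustration: the cohort is made up, the `marker_present` / `marker_absent` labels are hypothetical, and a real analysis would need proper statistics and far more care than a rate comparison.

```python
from collections import defaultdict

def adverse_rate_by_group(subjects):
    """Return {group: fraction of subjects with an adverse reaction}.

    `subjects` is an iterable of (group_label, had_adverse_reaction) pairs.
    """
    counts = defaultdict(lambda: [0, 0])  # group -> [adverse, total]
    for group, had_adverse_reaction in subjects:
        counts[group][1] += 1
        if had_adverse_reaction:
            counts[group][0] += 1
    return {g: adverse / total for g, (adverse, total) in counts.items()}

# Made-up toy cohort of 100 subjects, NOT real trial data.
# The marker labels are hypothetical placeholders.
toy_cohort = (
    [("marker_present", True)] * 12 + [("marker_present", False)] * 8
    + [("marker_absent", True)] * 3 + [("marker_absent", False)] * 77
)

overall = sum(1 for _, adverse in toy_cohort if adverse) / len(toy_cohort)
rates = adverse_rate_by_group(toy_cohort)

print(f"overall adverse rate: {overall:.1%}")
for group, rate in sorted(rates.items()):
    print(f"{group}: {rate:.1%}")
```

In this toy cohort the overall rate looks like a failure, while the per-group view suggests the problem may be concentrated in one subgroup, which is the shift from a binary to a gradient reading of the same trial.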

Citations:
DiMasi et al. (2003) The price of innovation: new estimates of drug development costs
Ernst & Young Global Pharmaceutical Industry Report (2011) Progressions Building Pharma 3.0
Tufts Center for the Study of Drug Development (2011) Outlook 2011 report


Filed under BigPharma, Genomics

The Fall in Gov Funding & Rise of Privatization in Genome Databases

Government Funded Sequence Database

As the space shuttle program comes to an end, we are reminded of the role of governments in birthing industries. Just like the manned space program, genomics has been mostly government funded, and just like the space program it’s about to take a big hit:

Recently, NCBI announced that due to budget constraints, it would be discontinuing its Sequence Read Archive (SRA) and Trace Archive repositories for high-throughput sequence data. However, NIH has since committed interim funding for SRA in its current form until October 1, 2011.

With its fall there will be few, if any (Japan: DDBJ & Europe: ERA), centralized public databases for nextgen sequencing. Once again we’re left to ride with the ruskies, figuratively speaking. Enter private industry and its first batter, from the SF Bay Area: DNA Nexus. Though Sundquist and his team have managed to create a very well-polished and modern platform, unlike SRA there is no data aggregation; there is no public pool where researchers can access data. This is a problem because much of the power of genomics comes from studying statistical trends, and a large, public data pool is to date the best way to make any sense of what our genes say.

A similar private effort from the great innovators across the ocean comes in the form of Simbiot by Japan Bioinformatics K.K. At the moment Simbiot is edging into the lead, as they’ve recently released two mobile applications allowing sequence data management and analysis on the go. However, just as with DNA Nexus, users are only given the option to keep their data within their own accounts or share it with select others. Both of the aforementioned companies have well-thought-out price plans, sleek interfaces and well-produced videos. But what makes government efforts like the SRA valuable is that for a time they provided a centralized public data pool.

Said Infamous Graph

As anyone who’s seen the now infamous graph of the rate of decrease in sequencing costs vs that of Moore’s law will likely have figured out by now, the costs associated with maintaining a sequencing database only increase with adoption of the technology. As such, it was reasonable for Uncle Sam to pay for this party at first, but the costs rise every year, by leaps. There must be a private model that is both aggregate & open in nature but can also pull its own weight in terms of cost; the future of healthcare and any form of “genomics industry” may well be dependent on it.
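The arithmetic behind that squeeze can be sketched in a few lines. This is a back-of-the-envelope model, not a forecast: it assumes total sequencing spend stays flat (so the volume of data doubles each time the cost per base halves) and that the price of storage halves on its own, slower schedule. The halving times used below are placeholders, not measured figures.

```python
def relative_storage_bill(months, seq_halving_months, storage_halving_months):
    """Storage bill after `months`, relative to the bill today (1.0).

    Assumes data volume doubles every `seq_halving_months` and the
    price per byte of storage halves every `storage_halving_months`.
    """
    data_growth = 2 ** (months / seq_halving_months)
    price_drop = 2 ** (months / storage_halving_months)
    return data_growth / price_drop

# Placeholder halving times (assumptions for illustration only).
for year in range(0, 5):
    bill = relative_storage_bill(12 * year, seq_halving_months=12,
                                 storage_halving_months=24)
    print(f"year {year}: {bill:.2f}x today's storage bill")
```

Whenever sequencing gets cheaper faster than storage does, the ratio grows exponentially; if the two halving times were equal, the bill would stay flat.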


Filed under Genomics

Decided? No, we just finished saying Good Morning: Sage Congress 2011

“Therefore a sage has said, ‘I will do nothing (of purpose), and the people will be transformed of themselves; I will be fond of keeping still, and the people will of themselves become correct. I will take no trouble about it, and the people will of themselves become rich; I will manifest no ambition, and the people will of themselves attain to the primitive simplicity’ ” reads Ch. 57 of the Tao Te Ching. How chillingly the two-millennia-old caricature of a wise, learned man holds true to this day.

Sage Bionetworks is a medical research organization whose goal is “to share research and development of biological network models and their application to human disease and biology.” To this end, top geneticists, clinicians, computer scientists and pharmaceutical researchers gathered this weekend at UC San Francisco. We were given an inspirational speech by a cancer survivor, followed by a report on the progress since last year’s congress. Although admirable on their own, the research and programs built in the last year seemed to remind us all again that in silico research was still closer to the speed of traditional life science than to the leaps and bounds by which the internet moves.

Example of an effort which aligns with & was presented at Sage

Projects like GenomeSpace by the Broad Institute give us hope of what’s possible while watching hours of debate and conjecture at Sagecon. There were many distinguished scientists, authors, Nobel laureates and government representatives, the totality of whose achievement here was coming to agreement on what should be built, who should build it and by when. Groups were divided into subgroups, and then those were divided yet again. All the little policy details, software choices and even funding options would be worked out. There was a lot of talk.

Normal Conference VS Developer Conference. SHDH Illustrated by Derek Yu

Compared with the hackathons that software developers hold in Silicon Valley, events like Sagecon leave much to be desired, the least of which is the beer. I doubt anyone enjoys sitting in a stuffy blazer listening to talks for hours on end. The hacker events are very informal and there is no set goal, yet by the end of 24 hours there are often great new programs, friendships and even companies formed. Iteration rate is key to finding solutions, and the rate-limiting step in the life sciences & medicine isn’t the talent or resources, it’s the culture; an opinion echoed by Sage’s own shorts-wearing heroes Aled Edwards & Eric Schadt.

“You must understand, young Hobbit, it takes a long time to say anything in Old Entish. And we never say anything unless it is worth taking a long time to say.”


Filed under Genomics, Microbiology

Library of Life: Genomic Databases & Browsers

DNA at its heart is enormous chunks of information. The genome of an organism like yeast, mice or humans contains an ocean of data. Currently there are several on-line genomic databases, a great example being SGD, dedicated to the yeast S. cerevisiae. SGD has become a necessary tool for life scientists over the past 10 years but at the same time has not kept up with information technology, resulting in a platform which works like a 10-year-old website.

SGD is clunky but necessary, for now

Above we see a typical SGD search; it takes 5 windows to arrive at the sequence data of 1 gene. Nevertheless, SGD is used by drug companies trying to find the next big hit, academic labs trying to cure cancer and field biologists studying wildlife.

DNA is extracted and placed through a sequencing machine, which spits out the information into a computer file. Just as having an aged internet browser affects our productivity, the browser one uses to view these files can have a large impact. Following the web-browser analogy, we take a look at 3 different sequence browsers, starting with Vector NTI.

Vector NTI is enterprise software.

Vector NTI is well established and often bundled with hardware. It has many features but can often seem like information overload, causing most users to stumble through its many menus and windows. A step up in usability comes from the third-party software suite Sequencher, popular amongst Mac users.

Sequencher is your friend

Sequencher strikes a healthy balance between features and usability, but it is a fairly resource-intensive program, requiring CDs and hard drive space to store local algorithms. However, the most up-to-date browser is likely to be the free and light download, 4Peaks.

4Peaks Simplicity & Usability

4Peaks allows the user to go in, read their sequence file and get out. What it lacks in features it makes up for in simplicity. The end goal of any software or database is to help researchers wade through all this information and continue their studies. In this environment, services such as GENEART offer to perform much of the genomics-related leg work on a given project.
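For a sense of how little is needed to "go in, read a sequence file and get out", here is a minimal sketch that reads sequences from FASTA, a common plain-text sequence format, and reports a basic statistic. It is an illustration only: the browsers above typically open richer trace files from the sequencer, FASTA is chosen here because it is plain text, and the records in the example are made up.

```python
import io

def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA-formatted text stream."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record begins; emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)

def gc_fraction(sequence):
    """Fraction of bases that are G or C (0.0 for an empty sequence)."""
    sequence = sequence.upper()
    if not sequence:
        return 0.0
    return (sequence.count("G") + sequence.count("C")) / len(sequence)

# Made-up records standing in for a real file opened with open(path).
example = io.StringIO(
    ">toy_read_1 made-up sequence\nATGCGC\nGGAT\n>toy_read_2\nAAAA\n"
)
records = list(read_fasta(example))
for header, sequence in records:
    print(header, len(sequence), f"GC={gc_fraction(sequence):.2f}")
```

To use it on a real file, replace the `io.StringIO` object with an opened text file; the generator reads one line at a time, so large files do not need to fit in memory.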

The databases, browsers and services are all tools which enable researchers to answer the questions that line our horizon. The progress of our tools has always directly correlated with our advancement. The life sciences’ adoption of information technology is a necessity as we discover that so much of life is condensed data in every nook.


Filed under Genomics, Microbiology, PCR