Notes from the weekly DAS/2 teleconference, 22 May 2006

$Id: das2-teleconf-2006-05-22.txt,v 1.2 2006/05/22 19:43:08 sac Exp $

Note taker: Steve Chervitz

Attendees:
  Affy: Steve Chervitz, Ed Erwin, Gregg Helt
  CSHL: Lincoln Stein
  U Alabama: Ann Loraine
  UCLA: Allen Day

Action items are flagged with '[A]'.

These notes are checked into the biodas.org CVS repository at das/das2/notes/2006. Instructions on how to access this repository are at http://biodas.org

DISCLAIMER:
The note taker aims for completeness and accuracy, but these goals are not always achievable, given the desire to get the notes out with a rapid turnaround. So don't consider these notes as complete minutes from the meeting, but rather abbreviated, summarized versions of what was discussed. There may be errors of commission and omission. Participants are welcome to post comments and/or corrections to these as they see fit.

Topic: status reports
---------------------

aday: working on converting xml to sql for the writeback functionality. Need to bind to apache. should be done today.
gh: would like to test writeback into IGB.
aday: will give url to gregg. if you post an xml doc to that url, you will get back a response mapping provisional names to persisted names in the db.
gh: editing too?
aday: no, creation only, at least today. maybe editing later this week.
gh: I could get some writeback for creation in igb this week, with effort.
aday: currently writes yeast and human features to the same data source.
gh: fine for testing. will focus on human.
sc: probably should just send the url to gregg, then send it to the list once we test it a bit.

[A] allen will send info to gregg re: accessing his writeback server

gh: turning on curation in igb. need to connect to das writeback. previously wasn't connected. given I have a das writer it should go smoothly. aiming to get something going this week. Not checked in. Still struggling with sf cvs change. other news: applying for no cost extension to current grant.
have 100-200k left. still determining. need to get app for extension in today. this is more than we thought. should last at least another couple of months. will let folks know when we know.
ls: pretty good.
gh: will contact p. good about bridge funding until new grant is funded. working with suzi to re-submit in October with her as PI.
sc: sf cvs trouble?
gh: hangs my jbuilder ide now. I updated repository information in CVS Root files using perl. JBuilder doesn't like it.
sc: I did the same sort of perl-based editing of CVS Root files and it worked for me.

al: Doing overrepresentation analysis for QTL studies. Usually this is for microarray analysis, where you want to see what things the expressed genes have in common. you collect GO annotations, and if you see some recurring in the list, you infer that they are important/involved in the process. We want to do overrepresentation analysis for gene assoc studies. look for go terms/pathways represented in the list of gene associations. having trouble getting a null model. plan to hammer on a das server to get a null distribution. would that be ok? thinking about using the UCSC das server.
gh: would be a great test to see how they handle lots of queries.
ls: also try the biomart interface at ensembl. das feeds into it internally. there are api's for a variety of languages, or you can send sql to it. try the ucsc das server first.
al: is there a contact person for ucsc das server?
ls: jim kent.
aday: write to genome mailing list. they're responsive.
al: top page of das server has message that implies it's not well-supported.
gh: the things you're interested in will do fine (txts, rna, gene predictions). doesn't do well on extra stuff in their db (phastCons conservation scores across regions of conservation, expression score, etc). For basic annotations it works fine.
al: most interested in the "knownGene" data type. das server gives back accessions. annotations are encoded. need to look up entrez gene id from accession.
using NCBI mapping files to get mRNA accessions. then get ontology annotations, also distributed via NCBI.
gh: in das/2 this is handled better with xids (this das id = that entrez id).
al: will ucsc update their server to das/2?
gh: more likely after grant renewal.
al: affy server?
sc: we have knownGene annotations.
al: doing this for a statistical genetics class this fall. expect lots of hits on the server then.

[A] Ann will test hammering on the affy das/2 server before turning her students loose on it this Fall.

sc: no writeback related work, but attended the JavaOne meeting last week. Some things of interest to our work:

1) Presentation on the ATOM protocol and REST in general. Mentioned something called WADL (Web Application Description Language), a format for describing REST-based web services that can be used to automate client-side interactions. Seemed more general than ATOM since one can use it to implement ATOM.

2) Eclipse RCP (rich client platform). Is a gutted version of the Eclipse Java IDE with all the java tools removed. You're left with lots of core functionality that can be of use for any java app. Was used by NASA when they needed to integrate lots of heterogeneous apps related to the Mars space mission. Seemed like we have a similar situation with IGB + Apollo integration. The Eclipse RCP gives you a framework for assembling tightly integrated components that are loosely coupled.
gh: do all the components need to be eclipse plugins?
sc: yes (believe so). not sure how much work that entails.
gh: I looked at using eclipse and netbeans back when starting to develop IGB, but that was a long time ago. might be worth looking again.
sc: there's a new book on building Eclipse RCP apps. On my desk.

[A] gregg will look into the eclipse RCP as a possible app framework.

3) Extreme GUI makeover talk was interesting to see what you can do with Java 2D to dress up a "plain-jane" application. This year, they created a souped-up version of the Thunderbird email app.
There are other enhancements coming with the next java release that improve java on the desktop (version 6 called Mustang, due in October).

4) java.util.concurrent package in java 5.0 has quite a lot of functionality for implementing things in your app that require multithreading support. There is a back port to JDK 1.4.

5) Scripting in java was very big. Using the groovy scripting language, with simplified syntax but access to all of java. AJAX was also very hot.

6) Probably the most relevant to this group was a talk called "How to write an API that will stand the test of time". While it was definitely java-centric, there are some nuggets that apply to any software or spec writing team:
  * Be use-case driven - focus on what people need now, not what might be useful later
  * Expose only necessary functionality - minimize your API's "surface area". This means less for users to learn, and less for devs to implement and maintain.
  * Be predictable and consistent - an interface that is predictable serves better than one which is locally optimal but inconsistent within the framework.
  * Design to test - write tests to fully cover API functionality.
  * Always think about evolution - the first version is never perfect.
    - allow for the co-existence of multiple versions.
  * An anti-example: The JavaMail API, which is optimized for implementing new wire protocols, not for the bulk of potential users who just want to read/send mail.
  * Links:
    - http://openide.netbeans.org/tutorial/api-design.html
    - http://www.artima.com/apidesign/index.html

gh: I saw something using gbrowse and AJAX.
ls: this is an ian holmes project. It has a google maps like interface. change track order by tracking, smooth scroll and zooming. the idea is to supersede igb so people can do it on the web.
al: you need to do 1d not 2d zooming.
ls: it does that.
gh: the hard part is dealing with lots of data.
al: I always get questions from people who look at google maps and say, "how come you genome people can't do something like this?"
ls: it's running now. go to http://genome.biowiki.org/ or look for "gbrowse ajax client" on biowiki.org.

Topic: Next meeting
--------------------
gh: next monday is a US holiday (memorial day, 5/29). next meeting will therefore be in two weeks.
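
Sketch: empirical null for overrepresentation
---------------------------------------------
Ann's status report above mentions needing a null distribution for overrepresentation analysis of gene association lists. One common way to get one is to draw random gene lists of the same size from a background set and count how often a term shows up at least as often as in the study list. The following is a minimal sketch of that idea, not Ann's actual method. The function name, the toy gene names and the "GO:example" term are all made up for illustration, and the background annotations are held in memory rather than fetched from a DAS server.

```python
import random


def empirical_p_value(study_genes, background, term, n_samples=2000, seed=0):
    """Estimate how often a random gene list of the same size contains at
    least as many genes annotated with `term` as the study list does."""
    rng = random.Random(seed)
    genes = sorted(background)  # fixed order so sampling is reproducible
    observed = sum(term in background[g] for g in study_genes)
    hits = 0
    for _ in range(n_samples):
        sample = rng.sample(genes, len(study_genes))
        count = sum(term in background[g] for g in sample)
        if count >= observed:
            hits += 1
    # Add-one correction keeps the estimate away from exactly zero.
    return observed, (hits + 1) / (n_samples + 1)


# Toy background (hypothetical): 100 genes, 10 of them annotated with one term.
background = {
    f"gene{i}": ({"GO:example"} if i < 10 else set()) for i in range(100)
}
# Hypothetical study list: 5 annotated genes plus 5 unannotated ones.
study = [f"gene{i}" for i in range(5)] + [f"gene{i}" for i in range(50, 55)]

observed, p = empirical_p_value(study, background, "GO:example")
print(observed, p)
```

With real data, the background dictionary would be built from whatever annotation source is used (for example the NCBI mapping files mentioned above), and the number of random samples would be chosen to match the precision needed for the p-value.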